Jun 6 2010

SAP BusinessObjects XI 3.1 Demonstration Walkthrough Part 1

This week I presented at the Mastering BusinessObjects 2010 conference, hosted in Melbourne, Australia.  The conference was organised by Eventful Management, who did a fantastic job.  International presenters included Timo Elliott, Ingo Hilgefort, Dave Rathbun, Rajeev Kapur and Tom Nather.  You can see some photos of the event by Timo here.

My topic was called ‘SAP BusinessObjects XI 3.1 End-to-End Demonstration’, in which I wanted to do a live demonstration of one dataset being used with most of the BOBJ tools, including Data Integrator, Universe Designer, Web Intelligence, Xcelsius, Live Office, QaaWS and Explorer.

Here is a link to the PowerPoint deck, though there’s not many slides.. (Fletcher_EndtoEndSBO_MBO)

I want to provide a walkthrough of the presentation, and include links to the datasets I used.  This should hopefully allow anyone (who has access to the SAP BusinessObjects XI 3.1 environment) to follow my steps.

To begin with, I introduced the dataset I used.  I downloaded it from http://data.australia.gov.au/, and it shows the Number of Crimes in New South Wales, Australia, broken down by Local Government Authority (LGA), Month/Year (over 12 years), Offence Category and Subcategory.  All in all it represents over 1.7 million records, so not a small dataset.  I also extended the dataset by introducing State Region (multiple LGA’s belong to a Region) in another CSV file.

To start with, the first part of the presentation was to use Data Integrator to import data from two CSV files into a SQL Server database, including de-normalising the data into a dimensional model.  You can download the CSV files from the Datasets page of my blog, under the NSW Crimes heading.  You will also need to create a database (I used SQL Server) to host the tables in the Target Datastore.

Here are the steps I following to build the ETL job to move the data from the CSV files to the SQL Server database.

  1. Create a new Flat File Format for the ‘NSW Crimes Data 95-08.csv’ (CSV#1) for extracting data
  2. Create a new Flat File Format for the ‘Regions and LGAs.csv’ (CSV#2) for extracting data
  3. Create a new database to store the extracted data in, and create a new Datastore (I called mine DEMO_E2E_SBO)
  4. Create a new Project and Job
  5. Within the new Job, create a new dataflow for staging the main source file, using a template table
  6. Run the job to stage the data
  7. Import the template table and set to overwrite
  8. Add a new dataflow for creating the DIM_DATE table, using the Date_Generation transform
    1. Set the date generation from 1995.01.01 to 2008.12.31, monthly
    2. Add the following output columns to the Query transform
      1. Date_SK (julian of Date)
      2. Year
      3. Quarter
      4. Month
      5. Date
  9. Add a new dataflow for DIM_LGA, using the CSV#1 flat file format as a source
    1. Add the following output columns to the Query transform
      1. LGA_SK (use gen_row_number function)
      2. State_Region (leave blank for now)
      3. Local_Government Authority
    2. Group by the LGA column
    3. Add in the CSV#2 as another source, combining in another Query transform, with the following changes
      1. Populate State_Region
      2. Make sure the WHERE clause includes LGA=LGA
  10. Create a new dataflow for DIM_OFFENCE_TYPE, using the following columns in the Query transform
    1. Offence_Type_SK (use gen_row_num to populate)
    2. Offence Category
    3. Offence Subcategory
    4. Make sure to GROUP BY Offence Category and Subcategory
  11. Run the job to create and load the dimension tables, then import each of the dimension tables and set them to overwrite values
  12. Create a dataflow for loading the FACT_CRIMES
  13. Add in a pivot transform, using the following settings
    1. Pivot sequence column called ‘Seq_Number’
    2. Non-pivot Columns
      1. LGA
      2. Offence Category
      3. Offence Subcategory
    3. Pivot Columns
      1. All month columns
    4. Data field column called ‘Number_of_Crimes’
    5. Header column called’ Month’
  14. Add in a query transform, with settings as follows
    1. Add the following columns
      1. Crime _SK (use gen_row_num to populate)
      2. Date_SK – use this mapping = julian(to_date(’01 ‘ || Pivot.Month,’DD MON YYYY’))
      3. LGA_SK – use lookup_ext function to populate with the LGA_SK from the DIM_LGA table
      4. Offence_Type_SK – use lookup_ext function to populate with the Offence_Type_SK from the DIM_OFFENCE_TYPE table
      5. Number_of_Crimes – direct mapping
  15. Run the job
  16. Import fact table and set to overwrite, and you’re done!

I’ve also exported the entire Target Datastore to an Access DB (also available on the the Datasets page of my blog), as well as exporting the entire job to an ATL file which you can also download.  I created the job in Data Services XI 3.2, so the ATL file won’t be compatible with versions prior to that.

I hope that someone finds the above of value if they are learning Data Services for the first time.  In the next post I will cover the rest of the presentation, so stay tuned!


Mar 6 2010

New Laptop

I’ve had my Macbook 13″ for over a year now, and have adapted to using OS X Snow Leopard with VMWare Fusion running multiple VM’s as required.  My job as the SBO Team Lead means that I provide BDM support (aka demo’ing BusinessObjects software), staying current with new releases and being a technical lead, as well as occasionally doing actual development work.

I’ve had several frustrations with optimising VMs to run for demos.  Pausing my VMs caused poor performance, and even cold-booting the VMs still didn’t perform as well as I’d like.  I attribute this mainly to the bottleneck caused by my 320GB 7200rpm internal hard drive.  Both my CPU and RAM seem to never max out while running VMs – current specs being 2.4 GHz and 4GB RAM.

I’ve been planning to upgrade for a while and I’ve finally signed the documents and ordered the new laptop.  I’m staying with MacBooks, but am going to try out Bootcamp this time.

Here’s the specs:

  • 15″ Apple MacBook Pro
  • 3.06 GHz CPU
  • 8GB RAM
  • 256GB SSD (Photofast G-Monster)
  • Matte screen (glossy screens = mirror)

The Photofast SSDs apparently sustain 270Mb/s reads and writes, so I’m expecting this baby to fly!  I could have stayed with a 13″, but I would only have been able to get a 2.53 GHz CPU and the screen size is also getting on my nerves recently.

In terms of the software setup, as I’ve had to reduce my hard drive space because of the cost of SSDs (this one is costing me AUD $1,450, ouch), I’m going to dedicate most of the drive to a Bootcamp partition, and the rest of the space will house a Snow Leopard OSX partition.

I’m planning to install Windows Server 2008 R2 x64, with all the usual workstation apps such as Microsoft Office, and this WinServer instance will also be joined to our corporate domain.  I’m also planning to install:

  • SQL Server 2008 (DBMS only)
  • Oracle 10G XE
  • SAP BusinessObjects Enterprise XI 3.1
  • SAP Data Services XI 3.1
  • Crystal Reports 2008
  • Xcelsius 2008

You may wonder why I’m only going with XI 3.1, and not 3.2?  Well (and listen clearly anyone working for SAP) as Edge XI 3.1 SP2 (or XI 3.2) hasn’t been released yet, all of our in-house developed Rapid Marts for clients aren’t backwards compatible.  And a lot of our clients are using Edge XI 3.1, so this means I need to ensure my development platform is compatible with most of our clients.  Frustrating, but necessary… (sidepoint – I heard from the BOB forum the other day that Edge XI 3.1 SP2 has been delayed till mid-2010!)

I’m going to setup the WinServer instance similar to the internal CDI image that SAP release, which means that all the services that run the DBMS, BOE and DS will all be stopped on startup (making sure the system boots very quickly), and then use batch scripts to start up separate components as required.  I will also start to compile multiple demo databases, universes, dashboards etc and hope to grow a large collection of demo material.

I will then setup OSX to use VMWare Fusion to use the Bootcamp partition, which will mean that most of the time I’ll still be able to be in OSX if I’m not demoing etc.

Apparently the timeframe for delivery is 7-10 days because of the matte screen and modded CPU.  I’m hoping to spend a weekend getting everything up and going – I’m really looking forward to it actually, kind of sad :) – then I’ll be looking to benchmark and take some video tests.  I’m expecting a beast of a machine, but as to how beefy, remains to be seen.

Stay tuned for more in the new few weeks…

- Josh


Oct 6 2009

Service Pack Frustrations

Service Pack 2 for SAP BusinessObjects Enterprise was recently released, and included some great new features like Web Intelligence Input Controls – see Coy Yonce’s blog post here (needs an SAP login).

However, my frustrations lie in that no SP2 for Edge XI 3.1 was announced.  Remembering back several years, service packs for BOE could be applied to Edge and even Crystal Reports Server, as they were all compatible.

However, it seems that the codebase has started to diverge, or the teams responsible for service packs aren’t coordinated across the different platform offerings.

As a consultant who deploys Edge frequently, this is frustrating considering that there are no discernible differences when using BO Edge compared to BO Enterprise.  Having no programming background at all, I may be swinging wild here, but it seems to me that using license keys to switch features on and off (like clustering, or Federation) would be simpler than diverging what is essentially the same codebase and maintaining separate service pack and fix packs.

I’m not across the success of SAP BusinessObjects Edge in the greater world community, however in Australia it is becoming a very successful platform, and I often need to tell our clients that they have to wait another (estimated) 6 months for a service pack, when the features they require are already available to Enterprise users.

Does anyone else see a need for coordinated delivery of service packs for the BI platform?  Or perhaps someone from SAP can weigh in on why they are delivered separately?

If there are good reasons, I’d love to hear them as most of the annoyance comes from the lack of communication.

– Josh