Uploading a Dataset into Salesforce Analytics

One of the first steps when starting to work with Salesforce Analytics is investigating how to get data into the platform. There are several ways to upload a dataset into Salesforce Analytics:

  1. ELT Platform - Hosted on the Salesforce Analytics instance, the ELT tool pulls data from the Salesforce instance it is connected to.
  2. Dataset UI - Via the UI you can go to the upload page for the dataset and choose a local file to upload.
  3. Dataset Utility - A Java utility built by Salesforce that uses the Salesforce API to upload a dataset into Salesforce Analytics.
  4. Middleware or ETL tool (Informatica, Mulesoft, Snaplogic) - A third-party integration tool. Many of these have connectors that integrate with Salesforce Analytics, so data that is already being loaded and transformed can also be pushed into Salesforce Analytics.

The easiest way to get data into the Salesforce Analytics platform is to upload a CSV through the Dataset UI. Choose CSV under "Select a Data Source".

 
 

Once you have chosen to upload a CSV, you will be taken to the dataset page, where you can specify:

  • The CSV file to be uploaded
  • A custom JSON file
  • Dataset name
  • App - where you would like to store the dataset
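For the custom JSON file in the list above, the following is a minimal sketch of how such a file could be generated with Python. The key names (`fileFormat`, `objects`, `fields`, and so on) follow my understanding of the Salesforce external data metadata format and should be checked against the official documentation before use. The function name `build_metadata`, the `precision` and `scale` values, and the split of columns into text and numeric fields are assumptions made for illustration.

```python
import json


def build_metadata(dataset_name, text_fields, numeric_fields):
    """Build a metadata dict describing a CSV's columns.

    Assumption: key names mirror the Salesforce external data
    metadata format; verify them against the official reference.
    """
    def field(name, field_type, **extra):
        # One entry per CSV column.
        return {
            "fullyQualifiedName": f"{dataset_name}.{name}",
            "name": name,
            "label": name,
            "type": field_type,
            **extra,
        }

    # Text columns become attributes (dimensions).
    fields = [field(name, "Text") for name in text_fields]
    # Numeric columns become measures; precision/scale are illustrative.
    fields += [
        field(name, "Numeric", precision=18, scale=2, defaultValue="0")
        for name in numeric_fields
    ]

    return {
        "fileFormat": {
            "charsetName": "UTF-8",
            "fieldsDelimitedBy": ",",
            "fieldsEnclosedBy": "\"",
            "numberOfLinesToIgnore": 1,  # skip the header row
        },
        "objects": [
            {
                "connector": "CSV",
                "fullyQualifiedName": dataset_name,
                "label": dataset_name,
                "name": dataset_name,
                "fields": fields,
            }
        ],
    }


if __name__ == "__main__":
    metadata = build_metadata(
        "ontime", ["Carrier"], ["Flights", "DepartureDelayMinutes"]
    )
    print(json.dumps(metadata, indent=2))
```

Note that this sketch lists text fields first and numeric fields second. If the order of fields needs to match the column order of the CSV, the field list should be built in that order instead.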

The UI upload process currently limits the size of the CSV file to 500MB. If you need to import a larger dataset into Salesforce Analytics, you can use one of the other options, for example the Dataset Utility, the API, or middleware tools.
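Before choosing an upload route, it can help to check the file size locally. The sketch below is one way to do that in Python. It assumes the 500MB limit means 500 * 1024 * 1024 bytes (the text does not say how the limit is measured), and the function name `fits_ui_upload` is hypothetical.

```python
import os

# Limit quoted above for uploads through the UI.
# Assumption: 500MB is interpreted as 500 * 1024 * 1024 bytes.
UI_UPLOAD_LIMIT_BYTES = 500 * 1024 * 1024


def fits_ui_upload(path, limit=UI_UPLOAD_LIMIT_BYTES):
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```

If the function returns False for your CSV, one of the other routes (Dataset Utility, API, or a middleware tool) is the better choice.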

We will now look at the easiest of these ways to upload a dataset into Salesforce Analytics: uploading a CSV. The first step is to find an appropriate dataset to upload. Any CSV dataset is fine. One that is commonly used for testing and for exploring a variety of measures and attributes on the platform is a dataset called Ontime, which includes measures and attributes from the US airline industry. You can find this demo dataset here:

US Airlines Ontime Dataset

Once you have downloaded the dataset, go to the Dataset UI and choose to upload it. Give the dataset a name on the left-hand side, and choose an app in which to store it. Your private app is a good place, because no one else can access the dataset automatically. When you start the upload, you have the option to keep the data in .ZIP format or upload it as a CSV file. If you would like the measures and attributes configured for you, you will need to upload it as a CSV file.

The upload may take some time, depending on the size of the dataset. Once the file has been uploaded, navigate to where you stored the dataset and open it. You will see the dataset along with a count of how many rows of data it contains. To verify that Salesforce Analytics has generated the measures and attributes, create a few groupings and measures and check that the data looks correct. For the Ontime dataset, choose the following:

  • Sum of Flights
  • Sum of DepartureDelayMinutes
  • Group by Carrier

Looking at the results, you can see, for example, the total number of minutes that American Airlines was delayed across all of its flights in the dataset. In this case, it looks like the dataset has been uploaded and the measures and attributes have been defined successfully.
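To cross-check the numbers shown in Salesforce Analytics, the same grouping can be computed locally from the CSV. The following is a rough sketch using only the Python standard library. It assumes the CSV has a header row with columns named exactly `Carrier`, `Flights`, and `DepartureDelayMinutes` (the names used above), and that blank cells should count as zero. The function name `summarise_by_carrier` is hypothetical.

```python
import csv
from collections import defaultdict


def summarise_by_carrier(csv_path):
    """Sum Flights and DepartureDelayMinutes per Carrier from a CSV file.

    Assumption: the header names match those used in the article.
    """
    totals = defaultdict(
        lambda: {"Flights": 0.0, "DepartureDelayMinutes": 0.0}
    )
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            carrier = row["Carrier"]
            for measure in ("Flights", "DepartureDelayMinutes"):
                # Treat blank or missing cells as zero.
                value = row.get(measure) or "0"
                totals[carrier][measure] += float(value)
    return dict(totals)
```

Calling `summarise_by_carrier` with the path to the downloaded file returns one entry per carrier, which can be compared against the sums shown in the platform. For very large files this reads row by row, so memory use stays small, though it may take a while to run.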

 

Salesforce Analytics - On time Dataset - Measures and Attributes