Sitecore Blog: @sitecorejohn blog

Sitecore Digital Marketing System SQL Query to Populate Excel Funnel

By John West, August 29, 2011 | Rating:  | Leave a comment

This blog post describes my experience writing an SQL query against the database tables in the Sitecore Digital Marketing System (DMS) to support an analytics funnel created using Microsoft Excel. You can access a diagram of the DMS database schema in my dreamcore 2011 presentation, DMS Under the Hood.

One of our regional offices recently involved me in a project that required exporting data from DMS to an Excel spreadsheet to visualize as a funnel. I am not a database expert, but I have some experience with SQL and the DMS database.

Excel people always seem to want their data in a single table that violates database design principles, and this case was no exception: a single table showing home page hits and the estimated (relative) value of those visits broken down by date, with data for specific campaigns in the SAME single-table export, and data for specific events (for those events associated with campaigns) in the SAME table.

I determined that it would be more practical to solve this with SQL than APIs. In general, APIs sometimes make me appreciate SQL, but then SQL then makes me appreciate APIs again.

As usual, the hardest part was trying to understand the requirements. You wouldn't believe how much time it took to write three what I see as three queries connected using UNION, and I'm not sure how to validate the data. There were a couple of tricks though.

The totals for the home page were relatively straightforward. These queries are a little different than mine, which also filtered by site and date, but I think these should still work.

SELECT DISTINCT Visits.Multisite AS 'Site', 'Home' AS 'Item Name', Pages.ItemId AS 'Item ID', COUNT(Pages.PageId) AS 'Visits', SUM(Visits.Value) AS 'Value', CONVERT(VARCHAR, Visits.StartDateTime, 107) AS 'Date' 
FROM Pages, Visits
WHERE Pages.VisitId = Visits.VisitId AND
Pages.ItemId = 'cb6ce567-ad2e-4828-9fae-a0a640516661'
GROUP BY Visits.Multisite, CONVERT(VARCHAR, Visits.StartDateTime, 107), Pages.ItemId



Visits and value broken down by campaigns is a bit harder, but still not hard. DMS lets you associate multiple campaigns with each visit, so I couldn't just use CampaignID column in the Visits table. Instead, look for a record in the PageEvents table with the Data column equal to the ID of the campaign (replace <CAMPAIGN GUIDs> with the GUIDs of the campaigns of interest):

UNION SELECT DISTINCT Visits.Multisite, 'Campaign: ' + Campaigns.CampaignName, Pages.ItemId, COUNT(Pages.PageId), SUM(Visits.Value), CONVERT(VARCHAR, Visits.StartDateTime, 107) 
FROM PageEvents, Visits, Campaigns, Pages
WHERE Visits.VisitId = Pages.VisitId AND PageEvents.PageId = Pages.PageID AND
PageEvents.Data = '{'+CAST(Campaigns.CampaignId AS NVARCHAR(36))+'}' AND
PageEvents.Data in ('{<CAMPAIGN GUID 1>}','{<CAMPAIGN GUID N>}') 
GROUP BY Visits.Multisite, CONVERT(VARCHAR, Visits.StartDateTime, 107), 'Campaign: ' + Campaigns.CampaignName, Pages.ItemId



If a visit were associated with multiple campaigns, this would report it for both campaigns, which I assume is desirable.

Events associated with campaigns were the hardest, but still really not that hard. Because both goals and events are in the PageEvents table, the query has to check the page events table twice - find all visits associated with one of these events AND one of these campaigns. I have some background with Oracle, so I started with IN and IF EXISTS, but the performance was terrible. I ended up with the PageEvents table twice in the FROM clause:

UNION SELECT DISTINCT Visits.Multisite, 'Event: ' + PageEventDefinitions.Name, A.PageEventDefinitionId, COUNT(A.PageEventId), SUM(Visits.Value), CONVERT(VARCHAR, Visits.StartDateTime, 107)
FROM Visits, PageEvents A, PageEvents B, PageEventDefinitions
WHERE A.VisitId = Visits.VisitId AND
A.PageEventDefinitionId = PageEventDefinitions.PageEventDefinitionId AND
B.VisitId = Visits.VisitId AND 
B.Data in ('{CAMPAIGN GUID 1}','{CAMPAIGN GUID N}') AND 
PageEventDefinitions.PageEventDefinitionId in ('<event guid 1>','<event guid N>')
GROUP BY Visits.Multisite, 'Event: ' + PageEventDefinitions.Name, A.PageEventDefinitionId, CONVERT(VARCHAR, Visits.StartDateTime, 107)

ORDER BY 'Site', 'Date', 'Item Name', 'Item ID'



This performed remarkably well against a table with at least 1.5 million events.

If you can see how to use this approach to analyze a funnel, you can download this entire untested query. If you try to run this query, you will likely get the following error until you replace the <event guids> with lowercase GUIDs, without the curly braces ("{}" - use '1847ec31-e652-4eab-9918-eae4c644eb66' for example):

Msg 8169, Level 16, State 2, Line 1
Conversion failed when converting from a character string to uniqueidentifier.

Note a slightly different format and a different column for event GUIDs, and that this does not report events not associated with any of these campaigns, and would list the same visit twice if it were associated with two events, even if those events occurred on a single visit.

If you have any suggestions regarding this SQL, please comment on this blog post.

Tags: API, Architecture, Information Architecture, Infrastructure, Integration

*
*
*

Learn More with Sitecore

Newsletter
*

George School | Read Case Study >

We wanted to build a relationship with a company that would work with us moving forward, and we felt that Sitecore – more than any of the other vendors – was honest, open, and helpful.

- Ari Betof, Director of Enrollment Management & Strategic Planning, George School