<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>PowerPivotGeek &#187; data sources</title>
	<atom:link href="http://powerpivotgeek.com/tag/data-sources/feed/" rel="self" type="application/rss+xml" />
	<link>http://powerpivotgeek.com</link>
	<description>An adventure in managed self-service computing</description>
	<lastBuildDate>Wed, 19 Jan 2011 23:04:58 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>When is a refresh not a refresh?</title>
		<link>http://powerpivotgeek.com/2009/11/15/when-is-a-refresh-not-a-refresh/</link>
		<comments>http://powerpivotgeek.com/2009/11/15/when-is-a-refresh-not-a-refresh/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 05:21:00 +0000</pubDate>
		<dc:creator>powerpivotgeek</dc:creator>
				<category><![CDATA[Data refresh]]></category>
		<category><![CDATA[Excel Services]]></category>
		<category><![CDATA[data sources]]></category>

		<guid isPermaLink="false">http://powerpivotgeek.com/?p=190</guid>
		<description><![CDATA[<p>Ok. This post will be a bit complicated but stick with me. Hopefully, in the end, all will be clear. And the geek in you will love it.</p>
<p>One of the things that users tend to gloss over, without realizing the implication, is that PowerPivot holds a copy of the data. If [...]]]></description>
			<content:encoded><![CDATA[<p>Ok. This post will be a bit complicated but stick with me. Hopefully, in the end, all will be clear. And the geek in you will love it.</p>
<p>One of the things that users tend to gloss over, without realizing the implication, is that PowerPivot holds a copy of the data. If you haven’t already, I suggest that you read my <a href="http://powerpivotgeek.com/2009/11/09/a-peek-inside-wheres-the-beef/">&quot;Where&#8217;s the beef?&quot; posting</a>. In that posting I talked about the fact that the <u><strong>data itself</strong></u> is pulled into the workbook when you save it. When you click any of these buttons:</p>
<p><a href="http://powerpivotgeek.com/wp-content/uploads/2009/11/image63.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="484" alt="image" src="http://powerpivotgeek.com/wp-content/uploads/2009/11/image_thumb64.png" width="630" border="0" /></a> </p>
<p>The import process runs. From then on, the source data can start to change and drift away from the values that are stored in memory and ultimately in the workbook. The data is ‘real-time’ only while the import is running; afterwards all calculations, pivots, and slices are driven by the stored data. On the client this is clear because we have the ‘Refresh’ button (and its options) that provides refresh on the client. But what about the server? Well, that is the core of this posting. Let’s take a closer look at it. We will start at the menu items for the Excel Services rendering of the workbook. Notice the options here:</p>
<p> <span id="more-190"></span>
</p>
<p><a href="http://powerpivotgeek.com/wp-content/uploads/2009/11/image64.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="379" alt="image" src="http://powerpivotgeek.com/wp-content/uploads/2009/11/image_thumb65.png" width="644" border="0" /></a> </p>
<p>The question is “What does it mean to ‘refresh’ the connection?” The answer is that it depends on the data provider. For virtually every OLEDB and ODBC provider that Excel Services uses, ‘refreshing a connection’ means going out to the data source and re-querying it for its data. SQL Server RDBMS, Oracle, Teradata: for virtually all of them it means re-fetching the data that Excel Services holds. And it means that in PowerPivot also, but in PowerPivot where is the data stored? (You know the answer to this already, don’t you?) The data is in the workbook. Has the workbook changed since you last opened the .xlsx file? Well, I suppose it might have, in which case refreshing the connection might bring in new data. But in the vast majority of cases, <em>refreshing the PowerPivot table means just re-reading the data that Excel Services already has</em>. In most cases, it has absolutely no effect at all.</p>
<p>To really drive this home, let’s shift into super-geek mode and drill down into the workbook itself. I will go back to the workbook in the first screen shot and first click on the Connections option in the Data ribbon. Notice that there is a connection that has been defined behind my back in the workbook. It is called “Sandbox”, which, by the way, was the name of our system prior to Gemini and prior to PowerPivot. I didn’t create that connection. It was created for me when the PowerPivot Excel add-in was first started. This is the connection that actually interfaces to the in-memory database. Now let’s drill down further into the “Sandbox” connection and look at its connection string. WOW! The “Data Source=” property, which would normally point to the server where the database is stored, instead points to “<strong>$Embedded$</strong>”. What’s that? </p>
<p><a href="http://powerpivotgeek.com/wp-content/uploads/2009/11/image65.png"><img title="image" style="border-right: 0px; border-top: 0px; display: inline; border-left: 0px; border-bottom: 0px" height="484" alt="image" src="http://powerpivotgeek.com/wp-content/uploads/2009/11/image_thumb66.png" width="644" border="0" /></a> </p>
<p><strong>$Embedded$</strong> is the magic tag that tells PowerPivot for SharePoint that the data does not come from some server somewhere. Instead, the data comes from the workbook itself. Among the new OLEDB interfaces created for PowerPivot is a property, set by Excel Services, that contains the URL of the workbook that Excel Services is opening. The msolap OLEDB provider takes that URL and replaces the $Embedded$ string with the URL itself, and thus the infrastructure reads its data from the workbook itself.</p>
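<p>To make the substitution concrete, here is a minimal Python sketch of the idea. It is not the msolap provider’s actual code: the function name <code>resolve_embedded_source</code>, the sample connection string, and the workbook URL are all made up for illustration.</p>

```python
EMBEDDED_TAG = "$Embedded$"


def resolve_embedded_source(connection_string, workbook_url):
    """Swap the $Embedded$ placeholder in 'Data Source' for the workbook URL.

    Hypothetical helper: it only illustrates the substitution described above.
    """
    resolved = []
    for part in connection_string.split(";"):
        if not part.strip():
            continue  # skip empty segments such as a trailing ';'
        key, _, value = part.partition("=")
        if key.strip().lower() == "data source" and value.strip() == EMBEDDED_TAG:
            value = workbook_url  # the data lives in the workbook itself
        resolved.append(key.strip() + "=" + value.strip())
    return ";".join(resolved)


# Illustrative values only; the real "Sandbox" connection string has more properties.
print(resolve_embedded_source(
    "Provider=MSOLAP;Data Source=$Embedded$",
    "http://sharepoint/docs/sales.xlsx",
))
```

<p>Anything other than <code>$Embedded$</code> in “Data Source=” is left alone, so an ordinary server connection passes through unchanged.</p>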
<p>But (and this is the critical “BUT”) notice that the embedded content never changes. After you upload a workbook, that workbook doesn’t change on its own. Thus neither does the data. Remember, the data embedded in the workbook is a <strong><u>copy</u></strong> of the source data. If Excel Services refreshes it, the ECS calc engine gets the same data over and over again. The SSAS database embedded in the workbook hasn’t changed, so the data refresh is a no-op. Refreshing a connection to an embedded PowerPivot database doesn’t refresh anything. You get the same data over and over again.</p>
<p>So, how does the workbook data get refreshed? After all, there must be some way to do it . . . In fact, there are two ways:</p>
<ol>
<li>Download the workbook to the client and refresh the data in the workbook. Then re-publish the workbook back to the same location in SharePoint. The new data is automatically given to Excel Services and existing connections. </li>
<li>Use the data refresh facility; see the <a href="http://powerpivotgeek.com/misc/my-other-blog-articles/powerpivot-data-refresh/">data refresh posting</a> and <a href="http://powerpivotgeek.com/2009/11/12/steps-taken-during-a-powerpivot-data-refresh/">detailed steps posting</a> for more information. In this case the PowerPivot System Service will reach out and pull new data into the workbook. A new version of the workbook is created and the new data is automatically given to Excel Services and existing connections. </li>
</ol>
<p>And before you ask, <u>No</u>, PowerPivot V1 has no option to monitor the data in real-time and update its data in-memory as the source data changes. The workbook captures the data at a point in time – and then users work with that data. There are no provisions for real-time access to data while doing analytics / calculations / pivot table operations. </p>
]]></content:encoded>
			<wfw:commentRss>http://powerpivotgeek.com/2009/11/15/when-is-a-refresh-not-a-refresh/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Steps taken during a PowerPivot data refresh</title>
		<link>http://powerpivotgeek.com/2009/11/12/steps-taken-during-a-powerpivot-data-refresh/</link>
		<comments>http://powerpivotgeek.com/2009/11/12/steps-taken-during-a-powerpivot-data-refresh/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 05:08:55 +0000</pubDate>
		<dc:creator>powerpivotgeek</dc:creator>
				<category><![CDATA[Data refresh]]></category>
		<category><![CDATA[credentials]]></category>
		<category><![CDATA[data sources]]></category>
		<category><![CDATA[ULS]]></category>

		<guid isPermaLink="false">http://powerpivotgeek.com/?p=177</guid>
		<description><![CDATA[<p>In this posting we will take a more detailed technical look at how the data refresh facility works and the steps that it takes to accomplish a data refresh cycle. Rather than starting with the &#8220;Manage data refresh&#8221; page, we will assume that you know how to set up a schedule – in this posting, we [...]]]></description>
			<content:encoded><![CDATA[<p>In this posting we will take a more detailed technical look at how the data refresh facility works and the steps that it takes to accomplish a data refresh cycle. Rather than starting with the &#8220;Manage data refresh&#8221; page, we will assume that you know how to set up a schedule. In this posting, we will take a deep dive into the cycle itself.</p>
<h4>What steps are taken when the data is refreshed?</h4>
<p>Now that you have configured your schedule(s) for the workbook, let’s take a step back and examine more closely what data refresh actually means. I think that it is valuable to understand, at some basic level, exactly what the system is going to do on your behalf at 2 a.m. When a job actually runs, the data refresh facility goes through the following steps:</p>
<ol>
<li>First, the system looks for schedules that are ‘runnable’, meaning that their scheduled time has come due. As all of the jobs might be scheduled at close to the same time (midnight, for example, is a popular time <img src='http://powerpivotgeek.com/wp-includes/images/smilies/icon_smile.gif' alt=':-)' class='wp-smiley' />  ), the system tries to run the job as soon as it can. All of the PowerPivot SharePoint servers are doing this at the same time. Ultimately one of them detects that your job has “come due” and is runnable.</li>
<li>After impersonating the Windows user specified in the schedule, the system extracts the workbook from the content database using the SharePoint binary OM. The user must supply a valid Windows account in the schedule and he or she must ensure that that account has contributor (read/write) access rights to the workbook. The workbook is stored in a temporary folder (in the SSAS Backup folder) so it can be used later (see step 9 below).</li>
<li>The system extracts the embedded database from the workbook and loads the database into the local SSAS Engine instance. The database is loaded read/write (so it can be updated). This database is only used for this data refresh job – the system ensures that it is <span style="text-decoration: underline;">not</span> used for querying while updating is going on (the SSAS processing commands).</li>
<li>If any data source specified for this schedule has custom data source credentials specified for the job, its connection string properties are changed (in V1 we only support changing the “Username” and “Password” properties). This is done using an XMLA command to the data source.</li>
<li>The system impersonates the Windows user for a second time and sends processing commands to the database. This causes the Engine to reach out to the sources and pull updated data into the database. The processing command is not sent to all tables/dimensions. The process commands are sent just to those objects that are dependent on the data source(s) included in the schedule.</li>
<li>The data source credentials (if any) are reset.</li>
<li>The database is saved back to the workbook.</li>
<li>If it is not set already, the embedded connection’s property “Refresh data when opening the file” is set to True. This ensures that users immediately see the new data when the workbook is opened. It also means that snapshot generation will include the new data in the thumbnail.</li>
<li>Impersonating the Windows account a third time, the system saves the workbook back to the content database using the SharePoint binary OM. If the document library is a PowerPivot Gallery, then the OM fires its ‘new file’ event handler, which starts the snapshot generation process. The ‘new file’ event handler was added by the Gallery content type.</li>
<li>The schedule’s status is updated with information about the job, i.e. its success, failure, error messages, etc.</li>
<li>And finally, the database is converted to a read-only database so it is available to users immediately for querying. This makes the user’s first query as fast as possible and lessens the load on the SharePoint content database since the PowerPivot database is already loaded into memory. </li>
</ol>
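<p>Step 5 is the one that people tend to misread, so here is a small Python sketch of the selection it describes: only the objects that depend on a data source included in the schedule get a processing command. This is an illustration under assumptions, not the PowerPivot System Service’s code; the function name <code>objects_to_process</code> and the table and data source names are hypothetical.</p>

```python
def objects_to_process(table_sources, scheduled_sources):
    """Return the tables whose data source is included in the schedule.

    table_sources maps a table name to the data source it is loaded from.
    Both the mapping and the names are hypothetical examples.
    """
    wanted = set(scheduled_sources)
    return sorted(
        table for table, source in table_sources.items() if source in wanted
    )


# Hypothetical workbook: two tables from a SQL source, one from a file.
tables = {"Sales": "SqlDW", "Budget": "ExcelFile", "Customers": "SqlDW"}

# Only "SqlDW" is part of the schedule, so "Budget" is left untouched.
print(objects_to_process(tables, ["SqlDW"]))
```

<p>A table whose data source is not part of the schedule keeps whatever data it had when the workbook was last saved.</p>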
<p>The end result is that a new, updated workbook has been stored back in the original workbook’s document library – the overall system is primed and ready to go when the workbook is viewed.</p>
<p> </p>
<p><span id="more-177"></span>A few observations:</p>
<ul>
<li>Remember that to edit the schedule, you must enable it. I am always forgetting to check the “Enable” box at the top of the schedule. The radio buttons can still be selected if disabled, but the options will not expand. I cannot tell you how often I’ve sat staring at a page wondering what was wrong, only to realize that I forgot to enable the schedule.</li>
<li>The schedule is kept independent from the workbook itself. It is stored in the service application database indexed by the SPFile.FileID. This uniquely defines a file on the SharePoint farm. A file can be deleted and its schedule remains. Publish a new file and the schedule automatically picks up.</li>
<li>The schedule history (success or failure results with error messages) is also kept in the service application database, so it can remain long after the file has been deleted. While not available from the end-user’s UI (unless they recreate the file), the history information is available via the Management Dashboard, so the information can appear in a report (again, long after the file has been deleted).</li>
<li>An important point to remember: You specify the <span style="text-decoration: underline;">Windows user</span> here:  (one per schedule; pick your favorite method – one of the three)<a href="http://powerpivotgeek.com/wp-content/uploads/2009/11/image8.png"><img style="display: inline" title="image" src="http://powerpivotgeek.com/wp-content/uploads/2009/11/image_thumb9.png" alt="image" width="700" height="215" /></a>
<p>You specify the <span style="text-decoration: underline;">data source user</span> here: (one per data source; again, pick your favorite one of three methods)</p>
<p><a href="http://powerpivotgeek.com/wp-content/uploads/2009/11/image9.png"><img style="border-right-width: 0px; display: inline; border-top-width: 0px; border-bottom-width: 0px; border-left-width: 0px" title="image" src="http://powerpivotgeek.com/wp-content/uploads/2009/11/image_thumb10.png" border="0" alt="image" width="709" height="601" /></a></p>
<p>Get these two types of users mixed up and it will be <span style="text-decoration: underline;">very</span> confusing.</li>
<li>Troubleshooting: The ULS logs are your friend. Search Codeplex (<a href="http://www.codeplex.com">http://www.codeplex.com</a>) or your favorite SharePoint web site and pick up a good viewer. You will use it *a lot*.<br />
Another good ULS viewer is at: <a title="http://code.msdn.microsoft.com/ULSViewer" href="http://code.msdn.microsoft.com/ULSViewer">http://code.msdn.microsoft.com/ULSViewer</a> </li>
</ul>
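<p>The point about schedules and history living apart from the workbook is easier to see in code. Here is a toy Python model of it. It is only a sketch: the <code>ScheduleStore</code> class stands in for the service application database, the string <code>"file-1"</code> stands in for the SPFile.FileID, and none of these names come from the product.</p>

```python
class ScheduleStore:
    """Toy stand-in for the service application database (hypothetical)."""

    def __init__(self):
        self.schedules = {}  # file id -> schedule settings
        self.history = {}    # file id -> list of (succeeded, message)

    def set_schedule(self, file_id, settings):
        self.schedules[file_id] = settings

    def record_run(self, file_id, succeeded, message=""):
        self.history.setdefault(file_id, []).append((succeeded, message))


def delete_file(library, file_id):
    """Deleting a file touches only the document library, not the store."""
    library.pop(file_id, None)


store = ScheduleStore()
library = {"file-1": "sales.xlsx"}  # hypothetical document library

store.set_schedule("file-1", {"enabled": True})
store.record_run("file-1", False, "login failed")

delete_file(library, "file-1")

# The workbook is gone, but its schedule and history are still on record.
print("file-1" in library)
print(store.schedules["file-1"])
print(store.history["file-1"])
```

<p>Because everything is keyed by the file ID rather than stored in the workbook, the report data outlives the file.</p>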
<p>Enjoy.</p>
]]></content:encoded>
			<wfw:commentRss>http://powerpivotgeek.com/2009/11/12/steps-taken-during-a-powerpivot-data-refresh/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>

