<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:series="http://unfoldingneurons.com/"
	>

<channel>
	<title>SharePoint Magazine &#187; Performance</title>
	<atom:link href="http://sharepointmagazine.net/tag/performance/feed" rel="self" type="application/rss+xml" />
	<link>http://sharepointmagazine.net</link>
	<description>SharePoint Magazine is an online Magazine dedicated to the world of SharePoint</description>
	<lastBuildDate>Mon, 05 Jul 2010 09:14:11 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Load Testing Reveals Cause of SharePoint Server Performance Problem</title>
		<link>http://sharepointmagazine.net/technical/load-testing-reveals-cause-of-sharepoint-server-performance-problem</link>
		<comments>http://sharepointmagazine.net/technical/load-testing-reveals-cause-of-sharepoint-server-performance-problem#comments</comments>
		<pubDate>Wed, 30 Jun 2010 12:12:36 +0000</pubDate>
		<dc:creator>chrismerrill</dc:creator>
				<category><![CDATA[Case Studies]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[database]]></category>
		<category><![CDATA[Development]]></category>
		<category><![CDATA[Featured]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[poor performance]]></category>
		<category><![CDATA[sharepoint]]></category>
		<category><![CDATA[test design]]></category>
		<category><![CDATA[testing]]></category>
		<category><![CDATA[website capacity]]></category>

		<guid isPermaLink="false">http://sharepointmagazine.net/?p=4028</guid>
		<description><![CDATA[Although it seems obvious that adding hardware resources to a system should provide improved performance, a customized Microsoft Office SharePoint® Server (MOSS) website cluster showed the opposite in recent testing for a customer. Load testing with 200 simulated users gave disappointing results, with page durations between 10 and 30 seconds, and the system handling only about four pages per second. Curiously, reducing the cluster to a single SharePoint® server improved the performance.]]></description>
			<content:encoded><![CDATA[<p><strong>A New System&#8217;s Poor Performance</strong></p>
<p>Although it seems obvious that adding hardware resources to a system should provide improved performance, a customized Microsoft Office SharePoint® Server (MOSS) website cluster showed the opposite in recent testing for a customer. Load testing with 200 simulated users gave disappointing results, with page durations between 10 and 30 seconds, and the system handling only about four pages per second. Curiously, reducing the cluster to a single SharePoint® server improved the performance.</p>
<p><strong>Suspicions of Trouble</strong></p>
<p>Our customer, the Society for Human Resource Management, was worried. Their new web server system, intended as the primary interface to their 250,000 members, appeared to be too slow. It was due to go into production in only four months, and it was intended to provide improved workflow and publishing features, as well as an enhanced customer experience.</p>
<p>Our initial load testing showed that their concerns were justified. The new site could not handle 200 simulated users, let alone the anticipated load of 1500 simultaneous users.</p>
<p>There should not have been any problem. The hardware and software provided plenty of capacity, with a cluster of four servers, each an HP DL360 G5 with two quad-core processors. Three of them were running SharePoint® Server on 16 GB of RAM, and the fourth ran the database server on 32 GB. The servers sat behind a Cisco CSS load balancer on a 45 Mbps DS3 line. The database storage was an EMC CLARiiON CX500 SAN and the web servers used only local disk storage. All servers were running Windows Server 2003 64-bit Enterprise SP2. The web servers ran Microsoft Office SharePoint® Server 2007 64-bit, and the database server ran SQL Server 2005, roll-up 8.</p>
<p><strong>Candidate Causes</strong></p>
<p>What could the problem be? The possibilities included the hardware (CPU, network, memory or disk), the software, and the software configuration. In particular, we knew to suspect connection pools, thread pools, resource contention and database locking.</p>
<p><strong>Designing the Tests</strong></p>
<p>With the customer&#8217;s help, we proposed a handful of test cases to exercise about 500 pages from their site. This would give our load testing software, Web Performance Load Tester®, a repeatable interface to a relatively small subset of their entire site, a content-rich site with over 15,000 articles. A key consideration was to get rapid results on a very short deadline. We selected five test cases that exercised various navigation paths through the site.</p>
<p><strong>Initial Tests</strong></p>
<p>We were now ready to execute the first tests on the new site. Initial tests were not promising. Under a simulated load of 100 simultaneous users, the system returned pages, on average, in less than 3 seconds – but that was only after the first group of users had passed the homepage and login steps.</p>
<p><strong>Page duration greater than 20 seconds</strong></p>
<p>As we ramped up, the average page durations (APDs) peaked at over 12 seconds. After the second group of users was added, for a total of 200, average page durations exceeded 20 seconds, as shown in this chart:</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_410dec95.png" alt="Initial testing shows poor performance" /><br />
<em>Figure 1: Initial testing shows poor performance &#8211; 20-30 second page durations at 200 users</em></p>
<p>During the test, Load Tester&#8217;s Server Monitoring Agents gathered metrics that indicated hardware was not the bottleneck. Neither CPU, memory nor disk was taxed during the tests. Subsequent tests and investigations indicated that the network and load balancer were not limiting factors either.</p>
<p><strong>Testing the cluster&#8217;s individual SharePoint® servers</strong></p>
<p>The next step was to isolate each SharePoint® web server in the cluster and test them individually. These tests revealed a number of differences between the servers. For instance, one server was not compressing the page content. More importantly, we found that running the site with only a single SharePoint® web server resulted in better performance! A single server gave average page durations under 6 seconds with up to 300 users. This was three times the capacity of the system running three web servers. (As you view this chart, note that the test ran for a shorter period than the previous one, with a resulting change in scale on the Users axis and the Time axis.)</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_m5acf8b43.png" alt="A single server performed better" /><br />
<em>Figure 2: A single server performed better, but performance is still not acceptable</em></p>
<p><strong>CPU usage not scaling with applied load</strong></p>
<p>We also noted that CPU utilization was not scaling linearly with the applied user load. At about 400 users, the CPU utilization peaked on the web and database servers around 60% and 30% respectively.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_m739e4776.png" alt="CPU utilization levels off after 400 users" /><br />
<em>Figure 3: CPU utilization levels off after 400 users</em></p>
<p><strong>Hardware not the problem</strong></p>
<p>Additional user load did not raise these CPU levels. Indeed, CPU usage declined as more load was added. After the peak, additional load did not raise the key throughput metrics, such as hits/sec, pages/sec and bytes/sec. The server metrics did not indicate a bottleneck in any other hardware category (network, memory or disk), leaving software or software configuration as the most likely limiting factor. The most common culprits in this situation are connection pools, thread pools, resource contention and database locking. However, there was no indication in the test data that the pools or other resources were not configured correctly. Several DBAs had monitored the database server during the tests and none saw evidence of locking behavior. It was time to delve deeper into SharePoint-specific areas of concern.</p>
<p><strong>SharePoint® Tuning</strong></p>
<p>During the next series of tests, we focused on testing a single server, since there was little point in load testing and tuning a cluster of servers with the individual servers not operating up to their potential.</p>
<p>A number of optimizations to the SharePoint® configuration were suggested and implemented.</p>
<ul>
<li>We moved static resources (images, etc.) to an image library to facilitate caching of the resources in the browser;</li>
<li>We changed SharePoint® cache settings to Extranet Publishing Site;</li>
<li>We changed the custom role provider to use Role Provider Caching;</li>
<li>We also changed the Content Query Web Part to handle taxonomy more efficiently.</li>
</ul>
<p>After each change was implemented, we measured the change in performance. In each test, an improvement in bandwidth utilization was observed, particularly between the SharePoint® servers and the database; however, the end-user performance was unchanged.</p>
<p><strong>Testing against an out-of-the-box installation</strong></p>
<p>Next we tried to determine whether the entire SharePoint® installation would share this performance profile, or if it applied only to the instance that was being tested. The customer created a new out-of-the-box SharePoint® site using one of the example sites. We tested this site to 1500 users, and observed only slight degradation at the peak. The test was very near or past the bandwidth limits of the network connection, which was a 45 Mbps DS-3.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_136987f9.png" alt="Average page durations are greatly improved" /><br />
<em>Figure 4: Average page durations are greatly improved &#8211; under 5 seconds up to 1500 users</em></p>
<p><strong>Investigating Authentication</strong></p>
<p>Now convinced that the OS, hardware and SharePoint® installation were healthy, we returned to the original site and targeted authentication. A new test case was designed that visited six public pages as an unauthenticated user. The system was tested and scaled to 1000 users, but performance was poor. Average page durations were in the 10 second range. The system was stable, but performance degraded rapidly by 1200 users, as we again hit the bandwidth limits.</p>
<p>Curious to see whether the improved results of the previous test were due to a lower number of unique pages visited, rather than to authentication, we next designed a test case that visited a larger number of pages, both authenticated and not. This test included more pages than the first unauthenticated test, but far fewer than the original test scenario. This load test produced better performance, but was unstable, exhibiting a stalling behavior under load. For example, the system ramped up to 1300 users serving about 30 pages per second, but as the test added further load, throughput suddenly dropped to fewer than 5 pages per second. We observed the same stalling behavior in multiple test runs at varying load levels.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_m72c0d01e.png" alt="System throughput scaled with load, then dropped to very low levels" /><br />
<em>Figure 5: System throughput scaled with load, then dropped to very low levels</em></p>
<p><strong>Adding one test case causes instability</strong></p>
<p>We next dissected the same test case into several iterations, to determine if any particular group of pages performed better or worse than others, but found no offenders. We then returned to a set of pages that did not require authentication, this time picking a larger set of pages containing a variety of features. There were 27 pages total. Load tests revealed the system could service these pages with average page durations under one second at 1500 concurrent users with consistent throughput of about 39 pages per second for two hours. Further experimentation revealed that the addition of one relatively simple test case caused the system to become unstable. Now we had an easy way to demonstrate how different usage patterns could yield good and bad performance of the system under the same configuration. We hoped this result would allow Microsoft SharePoint Support Engineers to offer some SharePoint-specific tuning advice.</p>
<p><strong>Rebooting the database server improves performance</strong></p>
<p>During some of the previous tests, we also noticed that system performance sometimes degraded consistently from one test to the next. We subsequently discovered that rebooting the database server between test runs temporarily improved performance. To help get consistency from the test results we began regularly rebooting all the servers prior to each test. This is actually a good test practice to ensure a consistent testing environment. Although we did not realize it at the time, the symptom of improved performance after rebooting was important, and later proved to be key to understanding the fundamental problems with the system.</p>
<p><strong>Reducing the number of processors improves performance</strong></p>
<p>After looking at our test results as well as collecting their own data, Microsoft SharePoint® Support indicated that SharePoint® was apparently unable to make use of such large hardware (8 processors with 16 GB of RAM). In an effort to validate that the problem was indeed caused by the large hardware, they recommended that we reduce the number of processors to 4, and then later suggested reducing it to 2. In each case, this resulted in a surprising performance improvement, but the stalling behavior remained. Reducing the number of processors moved the point of failure, allowing the system to run longer before stalling, but did not cure the problem. We now had proof that the problem was unrelated to the size of the hardware and that it warranted more detailed, low-level analysis.</p>
<p><strong>Database Tuning</strong></p>
<p>Early in the testing we had suspected that the database was the bottleneck. However, an analysis of database performance during the tests by both the customer&#8217;s in-house DBAs and Microsoft DBAs determined that locking contention was at low levels and the database was performing well. This had put the focus on the SharePoint® servers. It now seemed prudent to return our attention to the database.</p>
<p><strong>Contention in allocation</strong></p>
<p>After additional testing and data gathering, Microsoft Support engineers found that contention on tempdb allocations within SQL Server was causing delays processing queries from SharePoint®. This problem is described in the Microsoft Knowledge Base (#328551).</p>
<p>The fix required adding tempdb data files within SQL Server (one for each processor) and enabling a startup parameter (-T1118) that instructs SQL Server to allocate full, uniform extents instead of mixed extents. SQL Server then spreads tempdb allocations across the equally sized data files in round-robin fashion. This change reduced resource allocation contention in the tempdb database, improving performance on complex queries.</p>
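<p>As an illustration of the tempdb change, the statements for the extra data files can be generated rather than typed by hand. The following is a minimal sketch, not the script used on this project: the function name, the <code>tempdevN</code> file names, the <code>T:\tempdb</code> directory and the 1024 MB size are all assumptions chosen for the example. It relies only on the standard <code>ALTER DATABASE ... ADD FILE</code> syntax, and the -T1118 startup parameter still has to be set separately.</p>

```python
def tempdb_add_file_statements(processor_count, data_dir, size_mb=1024):
    """Build T-SQL that adds tempdb data files so there is one per processor.

    tempdb starts with a single data file, so processor_count - 1 extra
    files are generated. File names, directory and size are illustrative
    assumptions, not values taken from the case study.
    """
    statements = []
    for n in range(2, processor_count + 1):
        statements.append(
            "ALTER DATABASE tempdb ADD FILE ("
            f"NAME = tempdev{n}, "
            f"FILENAME = '{data_dir}\\tempdev{n}.ndf', "
            f"SIZE = {size_mb}MB);"
        )
    return statements


if __name__ == "__main__":
    # The database server in this study had 8 processors (2 quad-core CPUs).
    for statement in tempdb_add_file_statements(8, "T:\\tempdb"):
        print(statement)
```

<p>Giving every file the same size matters, because SQL Server balances allocations according to the free space in each file.</p>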
<p><strong>Performance improved but instability continues</strong></p>
<p>After making this change, load tests indicated that the system was able to sustain 15 pages per second at 650 users for 2 hours on a single server. Web page performance had improved, with average page durations down to the 2-4 second range. Specific changes to custom SharePoint® components and some additional database optimizations suggested by Microsoft Support brought average page durations under 1 second.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_d13458.png" alt="Page durations were greatly improved, but performance was not stable" /><br />
<em>Figure 6: Page durations were greatly improved, but performance was not stable</em></p>
<p>Although we had achieved a fast, stable system on a small subset of pages, the instability re-appeared when we re-introduced the remaining three test cases into the mix. The poor behavior appeared after roughly 80 minutes of operation at load. The failure was not as bad this time, and rather than stalling, the system&#8217;s throughput would suddenly drop by 30-50% and then oscillate up and down wildly.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_7ca273a8.png" alt="System throughput is good but degrades severely and unpredictably" /><br />
<em>Figure 7: System throughput is good, but then degrades severely and unpredictably</em></p>
<p><strong>Revisiting SQL Server and the rebooting fix</strong></p>
<p>We now found ourselves wondering whether the SharePoint® Server or the SQL Server was the culprit. We recalled our discovery in previous testing that rebooting the database server temporarily improved performance, and brought it to the attention of the Microsoft Support engineers.</p>
<p>We also found that if we stopped the load test when the servers were in a degraded state and restarted within a few minutes, the degradation would continue, even at very low load levels. Further diagnostics around these symptoms revealed that once the system performance had degraded significantly, clearing the query plan cache in SQL Server (via DBCC FREEPROCCACHE) would restore system performance almost immediately. Unfortunately the fix was not permanent, and performance degraded again within a short period of time.</p>
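<p>The plan-cache diagnostic described above can be sketched as a small before-and-after timing helper. This is an assumption-laden illustration rather than the procedure the engineers used: the function names are hypothetical, <code>cursor</code> is any Python DB-API 2.0 cursor connected to the SQL Server instance, and <code>probe_sql</code> stands for a representative query of your own choosing. Note that DBCC FREEPROCCACHE clears cached plans for the whole instance, so it belongs in a test environment.</p>

```python
import time


def timed_query(cursor, sql):
    """Run a query on a DB-API 2.0 cursor and return the elapsed seconds."""
    start = time.perf_counter()
    cursor.execute(sql)
    cursor.fetchall()  # pull all rows so the full query cost is measured
    return time.perf_counter() - start


def probe_around_cache_clear(cursor, probe_sql):
    """Time a probe query before and after clearing the plan cache."""
    before = timed_query(cursor, probe_sql)
    cursor.execute("DBCC FREEPROCCACHE")  # drops all cached query plans
    after = timed_query(cursor, probe_sql)
    return before, after
```

<p>If the second timing is much lower than the first while the system is in its degraded state, that matches the symptom reported here.</p>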
<p><strong>Single-threaded cache access in a multi-processor system</strong></p>
<p>These discoveries led the Microsoft engineers to a Microsoft Knowledge Base article (#927396) that indicated problems with the size of the TokenAndPermUserStore cache in SQL Server. When the server has a large amount of physical memory (in this case 32 GB) and the rate of random dynamic queries is high, the number of entries in this cache grows rapidly. As the cache grows, the time required to traverse and clean up the cache can be substantial. Because access to this cache is single-threaded, queries can pile up behind each other waiting for the cleanup to complete. This queuing slows performance and prevents a multi-processor system from scaling as expected. The remedy was to start SQL Server with a “-T4618” parameter, which limits the TokenAndPermUserStore cache size. (This was not one of the solutions listed in the Microsoft Knowledge Base for this issue – it was provided by a Microsoft Support Engineer.)</p>
<p><strong>Security Token Cache Size bug in SharePoint®</strong></p>
<p>After the cache-limit fix was applied to SQL Server, the next load test of the system showed steady performance with 15 pages/sec and APDs under 1 second, supporting 650 concurrent users for 10 hours. However, in a subsequent load test, errors reading “Arithmetic operation resulted in an overflow.” started appearing in the pages, indicating that SharePoint® was unable to render many web parts on the page. Microsoft quickly traced this to a bug in a SharePoint® cache implementation that was fixed by reducing the SharePoint® Security Token Cache size. Apparently the object cache throws integer overflow exceptions when the cache size is greater than 2000.</p>
<p>With the above fix applied and tested, the system was ready for a longer stress test to judge the stability of the system over longer periods. The next load test ran for 48 hours at 650 users. The system performed well – easily satisfying the performance requirement with only a single SharePoint® web server. No degradation of performance was observed. Further testing with all three SharePoint® servers and higher load levels showed similar success.</p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_49d303dc.png" alt="A successful 48-hour test at 650 users" /><br />
<em>Figure 8: A successful 48-hour test at 650 users</em></p>
<p><strong>Final Results</strong></p>
<p>Before stress testing and tuning, the website could handle only 100 users (4 pages/sec). With the improvements it handled 2000 users (45 pages/sec, nearly 800 hits/sec) with low CPU utilization (about 20%) on the servers. For reference, if held for an entire day, this rate would result in nearly 3.9 million page views per day.</p>
<p>At 2000 users, CPU utilization of the servers is below 25% – the customer&#8217;s Internet connection is now the only factor limiting total capacity. With a higher bandwidth connection, it is possible that this site could now service up to 8000 users.</p>
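<p>The figures in the two paragraphs above can be checked with simple arithmetic. The sketch below uses only numbers from the article (45 pages/sec, a 45 Mbps DS3, 2000 users at under 25% CPU); the function names are ours. The per-page byte budget assumes the link is fully saturated and ignores protocol overhead, and the user estimate assumes CPU scales linearly with load, which is presumably how the 8000 figure was reached but is not guaranteed.</p>

```python
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_day(pages_per_sec):
    """Pages served in a day if the per-second rate is held constant."""
    return pages_per_sec * SECONDS_PER_DAY


def bytes_per_page_budget(link_bits_per_sec, pages_per_sec):
    """Average bytes available per page when the link is saturated."""
    return link_bits_per_sec / 8 / pages_per_sec


def users_at_full_cpu(users, cpu_fraction):
    """Linear extrapolation of user count to 100% CPU utilization."""
    return users / cpu_fraction


if __name__ == "__main__":
    print(pages_per_day(45))                      # 3888000
    print(bytes_per_page_budget(45_000_000, 45))  # 125000.0
    print(users_at_full_cpu(2000, 0.25))          # 8000.0
```

<p>In other words, at 45 pages per second a saturated DS3 leaves roughly 125 KB per page, which is why bandwidth rather than CPU becomes the ceiling.</p>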
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_20bffea2.png" alt="2000 user test shows high throughput and steady performance" /><br />
<em>Figure 9: 2000 user test shows high throughput and steady performance</em></p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_m4b29f321.png" alt="2000 user test shows low page durations" /><br />
<em>Figure 10: 2000 user test shows low page durations</em></p>
<p><img src="http://sharepointmagazine.net/wp-content/uploads/2010/05/SHRMCaseStudy_htm_f0dfd4e.png" alt="Servers show low utilization at 2000 users" /><br />
<em>Figure 11: Servers show low utilization at 2000 users</em></p>
]]></content:encoded>
			<wfw:commentRss>http://sharepointmagazine.net/technical/load-testing-reveals-cause-of-sharepoint-server-performance-problem/feed</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>SharePoint Farm configuring and deployment. Part 1 &#8211; Architectural and Logical Planning</title>
		<link>http://sharepointmagazine.net/technical/administration/best-practices-of-sharepoint-farm-configuring-and-deployment-part-1-architectural-and-logical-planning</link>
		<comments>http://sharepointmagazine.net/technical/administration/best-practices-of-sharepoint-farm-configuring-and-deployment-part-1-architectural-and-logical-planning#comments</comments>
		<pubDate>Thu, 04 Jun 2009 01:00:17 +0000</pubDate>
		<dc:creator>Michael Nemtsev</dc:creator>
				<category><![CDATA[Administration]]></category>
		<category><![CDATA[Technical]]></category>
		<category><![CDATA[Architecture]]></category>
		<category><![CDATA[best practice]]></category>
		<category><![CDATA[Farm Design]]></category>
		<category><![CDATA[Performance]]></category>
		<category><![CDATA[SharePoint Farm]]></category>
		<category><![CDATA[Topology]]></category>

		<guid isPermaLink="false">http://sharepointmagazine.net/?p=2931</guid>
		<description><![CDATA[This series of articles provides an overview of how to plan, build and configure a common SharePoint farm across your organization.]]></description>
			<content:encoded><![CDATA[<ul>
<li><strong>Part 1 &#8211; Architecture and Logical Planning</strong></li>
<li>Part 2 &#8211; Installation</li>
<li>Part 3 &#8211; Development Environment</li>
<li>Part 4 &#8211; Backup and Recovery Strategy</li>
<li>Part 5 &#8211; Virtualization</li>
<li>Part 6 &#8211; Post Deployment (final)</li>
</ul>
<hr />
<h2>Overview</h2>
<p style="text-align: justify; ">Planning and installing a SharePoint farm across an enterprise network is not a trivial task. SharePoint is rarely installed in an isolated environment; it usually interacts with the organization&#8217;s strategy and existing infrastructure. Many factors may affect farm design, performance, scalability and redundancy &#8211; from hardware devices in the organization&#8217;s network to the network topology. Weighing those factors and finding compromises among them helps to build a consistent, reliable and flexible environment.<br />
There are several <a href="http://technet.microsoft.com/en-us/library/cc262733.aspx" target="_blank">whitepapers on the Microsoft TechNet portal</a> describing requirements for a SharePoint farm, but most of them are either written without taking the infrastructure scope into account or filled with irrelevant information that steers the reader away from the problem at hand.</p>
<p style="text-align: justify; ">In this document you will find configuration recommendations for different SharePoint areas. The information is presented as a set of recommendations on actions you need to take, or points that need additional attention, when you install and configure your SharePoint environment. We have tried to structure the sections to follow the natural flow of a SharePoint installation from scratch &#8211; from pre-installation requirements analysis to post-deployment actions.</p>
<p>We plan several whitepapers in our &#8220;Best Practices&#8221; series, and we would like to know which topics you want to see in our next SharePoint publications. Please send us your comments and suggestions via this <a href="http://msmvps.com/blogs/laflour/contact.aspx" target="_blank">form</a>.</p>
<h2>Introduction</h2>
<p style="text-align: justify; ">Organizations adopting SharePoint face a variety of tasks &#8211; from planning, strategy, infrastructure and architecture design, and UI design to migration and development. All these tasks assume that a flexible infrastructure baseline is in place before the actual work starts. In reality, however, we often face outdated environments and misconfigured farms that are not ready for new requirements. In such cases, the baseline architecture becomes the foundation stone of all SharePoint projects.<br />
Why would we care about infrastructure rather than something else, for example development? Fixing infrastructure errors is very expensive and leads to significant changes across the SharePoint farm. For example, an Index role assigned to the wrong server and an incorrectly configured Search can lead to performance and redundancy issues that might take up to 3 days to fix. Development errors are less expensive and can be fixed relatively quickly, but sometimes such errors eventually become infrastructure errors that lead to changes in infrastructure design.</p>
<p style="text-align: justify; "><img class="size-full wp-image-3381 alignleft" style="margin-left: 0px; margin-right: 25px;" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/article1-planning.jpg" alt="article1-planning" width="449" height="278" /></p>
<h2>&#8220;Architectural Planning&#8221;</h2>
<p style="text-align: justify; ">Plan your farm and network communications before starting the actual installation. The first step is to design the SharePoint architecture across the corporate network. This includes understanding the network structure, examining network devices and choosing the right SharePoint topology to fit the existing infrastructure and new requirements.</p>
<p style="text-align: justify; "><strong>Examine corporate network</strong></p>
<p style="padding-left: 30px; text-align: justify; ">Start with a description of the existing network design and the location of all application and system servers. Microsoft Visio 2007 with its &#8220;Network Diagram&#8221; template is a good tool for this task.</p>
<p style="padding-left: 30px; text-align: justify; ">Record the location and details of corporate system servers, such as domain controllers, file servers, mail servers, application servers and others. Don&#8217;t forget about network services &#8211; firewalls, proxies, etc. For example, record the locations of ISA Servers across the corporate network, with the IP address, list of open ports and administrative user for each.</p>
<p style="padding-left: 30px; text-align: justify; ">The best way to maintain the &#8220;Network Diagram&#8221; document is to keep updating a single diagram that covers the topology of all domains and how they are connected. The following diagram shows a Visio document describing the servers and devices across an organization.</p>
<p style="padding-left: 30px; text-align: justify; "><img class="alignnone size-full wp-image-3033" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/serverslocation.jpg" alt="serverslocation" width="557" height="403" /></p>
<p style="padding-left: 30px; text-align: justify; ">This diagram will give a holistic view of the existing topology and ensure quick access to information across different domains.</p>
<p style="text-align: justify; "><strong>Examine network devices</strong></p>
<p style="padding-left: 30px; text-align: justify; ">All network devices in the topology play a vital role in how SharePoint performs and how its servers interact. Information about the locations and settings of all routers, switches, and accelerators becomes very important when planning server locations. For example, the location of WAN and XML accelerators across the network affects how SharePoint servers are organized and configured.</p>
<p style="padding-left: 30px; text-align: justify; ">The presence of different network devices affects the connection bandwidth and latency between the farm&#8217;s servers, and thereby affects the choice of an appropriate SharePoint farm topology. Network Load Balancers (NLB), routers and switches affect how fast the network responds, so the farm should be designed to minimize the impact of these devices.</p>
<p style="padding-left: 30px; text-align: justify; ">Refer to the following links for the detailed information about WAN accelerators, NLB and other network devices across SharePoint farms:</p>
<ol>
<li><a href="http://technet.microsoft.com/en-us/library/cc263099.aspx">http://technet.microsoft.com/en-us/library/cc263099.aspx</a></li>
<li><a href="http://blogs.msdn.com/joelo/archive/2008/01/17/global-sharepoint-deployment-partner-solutions.aspx">http://blogs.msdn.com/joelo/archive/2008/01/17/global-sharepoint-deployment-partner-solutions.aspx</a></li>
<li><a href="http://blogs.msdn.com/joelo/archive/2007/01/05/nlb-network-load-balancing-and-sharepoint-troubleshooting-and-configuration-tips.aspx">http://blogs.msdn.com/joelo/archive/2007/01/05/nlb-network-load-balancing-and-sharepoint-troubleshooting-and-configuration-tips.aspx</a></li>
</ol>
<p style="text-align: justify; "><strong>Network administrator is a friend</strong></p>
<p style="padding-left: 30px; text-align: justify; ">The IT administrator should participate in farm configuration from the very beginning. This person will be responsible for the configuration of all network servers and devices across the corporate network.</p>
<p style="padding-left: 30px; text-align: justify; ">Most SharePoint farm topologies cross domain boundaries, so specific protocols and ports must be open from the very beginning. The best way to keep track of the current state is to maintain a separate document, shared with the administrator, that describes the protocols and ports to open across network services.</p>
<p style="padding-left: 30px; text-align: justify; ">Detailed information about system accounts and list of ports is available in the following articles:</p>
<ul>
<li>Plan for administrative and service accounts (Office SharePoint Server) <a href="http://technet.microsoft.com/en-us/library/cc263445.aspx">http://technet.microsoft.com/en-us/library/cc263445.aspx</a></li>
<li>Office SharePoint Server security account requirements <a href="http://go.microsoft.com/fwlink/?LinkID=92883&amp;clcid=0x409">http://go.microsoft.com/fwlink/?LinkID=92883&amp;clcid=0x409</a></li>
</ul>
<p style="text-align: justify; "><strong>Measure network latency</strong></p>
<p style="padding-left: 30px; text-align: justify; ">Network response time is one of the important factors that can affect SharePoint farm design. Ideally, you need to measure the latency between SharePoint servers and users so that you can place servers where response time is lowest.</p>
<p style="padding-left: 30px; text-align: justify; ">Network latency is the key factor in determining which of the proposed scenarios to implement in the current SharePoint deployment. (<em>Latency is the time required for a packet to travel from one point on a network to another</em>).</p>
<p style="padding-left: 30px; text-align: justify; ">Use the Ping tool (ping.exe) to measure latency for:</p>
<ul style="text-align: justify; ">
<li>users &#8211; from the client computer to the Web server on the server farm;</li>
<li>data centres that host servers of the same farm &#8211; from a Web server in the remote data centre to the database server in the primary data centre.</li>
</ul>
<p style="padding-left: 30px; text-align: justify; ">Ping reports the round-trip time, while the values below are one-way latencies, so do not forget to divide the ping result by two.</p>
<p style="padding-left: 30px; text-align: justify; ">Compare the results to the data below, and adapt the environment so that latency stays below those values.</p>
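<p style="padding-left: 30px; text-align: justify; ">The halving step can be sketched in Python as follows. This is only a minimal sketch: it assumes the English-language summary line that Windows ping.exe prints ("Average = Nms"), and the function name and sample line are illustrative, not part of any SharePoint tooling.</p>

```python
import re

def one_way_latency_ms(ping_output):
    """Estimate one-way latency from the summary line of ping.exe output."""
    # Assumption: English-language Windows output, e.g. "Average = 80ms".
    match = re.search(r"Average = (\d+)ms", ping_output)
    if match is None:
        raise ValueError("no 'Average = Nms' summary found in ping output")
    round_trip_ms = int(match.group(1))
    # Ping measures the round trip; the planning values are one-way.
    return round_trip_ms / 2

# Illustrative summary line, not a real measurement.
sample = "    Minimum = 78ms, Maximum = 86ms, Average = 80ms"
print(one_way_latency_ms(sample))  # 40.0
```

<p style="padding-left: 30px; text-align: justify; ">Halving the round trip is itself an approximation, because it assumes the path is roughly symmetric in both directions.</p>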
<p style="padding-left: 30px;">
<table style="border-width: 1px" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="12%" valign="top"><strong>Number of users </strong></td>
<td width="13%" valign="top"><strong>Concurrent users (10%) </strong></td>
<td width="41%" valign="top"><strong>Central Solution </strong></td>
<td width="32%" valign="top"><strong>Distributed solution </strong></td>
</tr>
<tr>
<td width="12%" valign="top">100-5,000</td>
<td width="13%" valign="top">10-500</td>
<td width="41%" valign="top">Bandwidth: 3+ Mbps (dual T1)<br />Latency: &lt;100 ms</td>
<td width="32%" valign="top">Bandwidth: 1.5 Mbps (T1)<br />Latency: &lt;100 ms</td>
</tr>
<tr>
<td width="12%" valign="top">10,000</td>
<td width="13%" valign="top">1,000</td>
<td width="41%" valign="top">Bandwidth: 3+ Mbps (dual T1)<br />Latency: &lt;250 ms</td>
<td width="32%" valign="top">Bandwidth: 1.5 Mbps (T1)<br />Latency: &lt;500 ms</td>
</tr>
<tr>
<td width="12%" valign="top">100,000</td>
<td width="13%" valign="top">10,000</td>
<td width="41%" valign="top">Bandwidth: 3+ Mbps (dual T1)<br />Latency: &lt;250 ms</td>
<td width="32%" valign="top">Bandwidth: 1.5 Mbps (T1)<br />Latency: &lt;500 ms</td>
</tr>
</tbody>
</table>
<p style="padding-left: 30px; text-align: justify; ">
<p style="padding-left: 30px; text-align: justify; ">The critical limits for any SharePoint farm are 1.5 Mbps (T1) of bandwidth and 500 ms of latency. Going beyond these limits, with less bandwidth or more latency, increases page-load times dramatically, by a factor of at least four. Refer to the diagrams in the &#8220;<a href="http://technet.microsoft.com/en-us/library/cc262952.aspx">Plan for bandwidth Requirements</a>&#8221; document for more details about the bandwidth and latency results under different conditions.</p>
<p style="padding-left: 30px; text-align: justify; ">Available network bandwidth and latency influence geographic deployments significantly. Data transfers across WAN links that span multiple cities, states, provinces, countries, or continents require fast links to provide adequate response time, so design such topologies thoroughly.</p>
<p style="padding-left: 30px;">More details on bandwidth requirements are available in the following article: <a href="http://technet.microsoft.com/en-us/library/cc262952.aspx">http://technet.microsoft.com/en-us/library/cc262952.aspx</a></p>
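<p style="padding-left: 30px; text-align: justify; ">The comparison against the table above can be sketched as a small lookup. This is a rough sketch, not an official sizing tool: it assumes that each table row can be read as an upper bound on the total number of users, and the function and variable names are made up for illustration.</p>

```python
# One-way latency limits from the table above, in milliseconds.
# Assumption: each row is treated as an upper bound on the number of users.
LATENCY_LIMITS_MS = [
    # (max users, central solution, distributed solution)
    (5_000, 100, 100),
    (10_000, 250, 500),
    (100_000, 250, 500),
]

def latency_limit_ms(users, solution):
    """Return the latency limit for a user count and solution type."""
    column = {"central": 1, "distributed": 2}[solution]
    for row in LATENCY_LIMITS_MS:
        if row[0] >= users:
            return row[column]
    raise ValueError("user count is outside the range covered by the table")

def within_limit(one_way_ms, users, solution):
    """True when a measured one-way latency stays below the table's limit."""
    return latency_limit_ms(users, solution) > one_way_ms

print(latency_limit_ms(8_000, "distributed"))  # 500
print(within_limit(120, 3_000, "central"))     # False
```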
<p style="text-align: justify; "><strong>Become familiar with SharePoint farm communications</strong></p>
<p style="padding-left: 30px; text-align: justify; ">Before discussing server redundancy and farm topologies, let us review the farm servers and how they communicate with each other. The following picture, from the &#8220;<a href="http://technet.microsoft.com/en-us/library/cc262400.aspx">Planning an Extranet Environment for Office SharePoint Server</a>&#8221; TechNet article, illustrates the communication channels within a server farm and which servers handle client requests.</p>
<p style="padding-left: 30px; text-align: center;"><img class="size-full wp-image-2975 aligncenter" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/serverscommunications.jpg" alt="serverscommunications" width="350" height="229" /></p>
<p style="padding-left: 30px; text-align: justify; ">When a user issues a query, the query is sent to a Web server. The Web server communicates with the query server to build a list of results, and then communicates with the computer running Microsoft SQL Server to extend the list of results with summarization text, URLs, and security trimming. In parallel, the Web server gets page data from SQL Server and renders it on the fly. This diagram helps in understanding which roles to assign to farm servers.</p>
<p style="text-align: justify; "><strong>Plan a baseline topology</strong></p>
<p style="padding-left: 30px; text-align: justify; ">Analyse the existing infrastructure and plan a SharePoint topology for redundancy. The term redundancy is often misinterpreted to be synonymous with availability.</p>
<p style="padding-left: 30px; text-align: justify;"><strong>Redundancy </strong>refers to the use of multiple servers in a load-balanced environment for any of several purposes, such as to improve farm performance, to scale out to accommodate additional users, and to improve availability.<br />
<strong>Availability</strong> is a more specialized concept that refers to a multiple-server environment that is designed to accept connections and operate normally even when one or more of the servers in the farm are not operational. Therefore, availability implies redundancy.</p>
<p style="padding-left: 30px; text-align: justify; ">Several different topologies, from three to six servers per farm, can be used as a baseline. Which one to choose depends on the required level of redundancy and the available hardware. Not all clients can afford a topology with six or ten servers, because of budget limitations or data centre capabilities. Finding a compromise between the number of servers, the type of hosting and the server roles becomes a critical task, because this choice will affect the performance and extensibility of the SharePoint farm for several years ahead.</p>
<p style="padding-left: 30px; text-align: justify; "><strong><em>Three Servers Farm</em></strong></p>
<p style="padding-left: 30px; text-align: justify; ">For a farm with few servers, minimum availability can be achieved with the &#8220;3-servers farm&#8221; topology. In this topology the Web and Application servers are located together on one box and the database is on another box. The remaining third server gives a choice of which role to make redundant &#8211; the Web server role or the database server role.</p>
<p style="padding-left: 30px; text-align: center;"><img class="size-full wp-image-2984 aligncenter" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/threeserversfarm.jpg" alt="threeserversfarm" width="350" height="180" /></p>
<p style="padding-left: 30px; text-align: justify; ">A farm with two Web servers provides redundancy of the Web and Application roles and improves overall performance. The drawback of this design is that the data is not redundant (left farm). Alternatively, a farm with two database servers (a cluster) provides data redundancy and increases the availability of critical data, but users may temporarily lose access when the single Web server is unavailable (right farm).</p>
<p style="padding-left: 30px; text-align: justify; ">The &#8220;3-servers farm&#8221; is one of the most questionable farms in terms of redundancy and performance. With so few servers, it cannot provide Query Server redundancy and high performance at the same time.</p>
<p style="padding-left: 30px; text-align: justify; ">Redundancy can be achieved by placing the Query role on both Web/Application servers. In that case the database server is the only place left for the Index role, which hinders overall performance: the Index role is very CPU- and disk-intensive, which is why a database server is not a good place for it. An alternative is to assign the Index role to a Web server that also holds the Query role, but this does not work effectively, because in this case the index is not propagated to the other Query Server in the farm.</p>
<p style="padding-left: 30px; text-align: justify; ">If performance is one of the priorities, consider placing the Query Server and Index Server roles on different Web/Application servers. This design is also flexible in terms of extensibility, because the Index and Query roles do not have to be moved when new servers are added to the farm.</p>
<p style="padding-left: 30px; text-align: justify; ">Interestingly, a TechNet article (<a href="http://is.gd/8QbS">http://is.gd/8QbS</a>, page 26) states that a Query Server can&#8217;t be combined with a Web Application server in a 3-servers farm. In reality, the Web Application and Query roles together are very common, more common than not (one of the reasons is that the Query Server doesn&#8217;t use the network load balancer &#8211; it uses its own algorithm). What the TechNet article actually means is that placing the Index role on the database server is not a recommended solution.</p>
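<p style="padding-left: 30px; text-align: justify; ">The trade-off between the two three-server variants can be sketched by listing which roles run on only one server. The server names and the dictionary layout below are hypothetical and serve only to illustrate the reasoning; this is not a SharePoint API.</p>

```python
from collections import Counter

def single_points_of_failure(farm):
    """Return the roles that are hosted on exactly one server in the farm."""
    counts = Counter(role for roles in farm.values() for role in roles)
    return sorted(role for role, hosts in counts.items() if hosts == 1)

# Left farm: two Web/Application servers and one database server.
left_farm = {
    "server1": ["web", "application"],
    "server2": ["web", "application"],
    "server3": ["database"],
}

# Right farm: one Web/Application server and two clustered database servers.
right_farm = {
    "server1": ["web", "application"],
    "server2": ["database"],
    "server3": ["database"],
}

print(single_points_of_failure(left_farm))   # ['database']
print(single_points_of_failure(right_farm))  # ['application', 'web']
```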
<p style="padding-left: 30px; text-align: justify; "><em><strong>Four Servers Farm</strong></em></p>
<p style="padding-left: 30px; text-align: justify; ">An additional, fourth server adds redundancy for either the database server or the Web server. However, it does not help much with performance. This topology suffers from the same drawback as the &#8220;3-servers farm&#8221; &#8211; there is no good place for the Index Server when the Query role is redundant.</p>
<p style="padding-left: 30px; text-align: justify; "><em><strong>Five+ Servers Farm</strong></em></p>
<p style="padding-left: 30px; text-align: justify; ">The most common and highly available server farm topology is the &#8220;5+ servers farm&#8221;, a farm with a middle-tier server.</p>
<p style="padding-left: 30px; text-align: center;"><img class="size-full wp-image-2990 aligncenter" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/fiveserversfarm.jpg" alt="fiveserversfarm" width="125" height="180" /></p>
<p style="padding-left: 30px; text-align: justify; ">
<p style="padding-left: 30px; text-align: justify; ">The middle tier solves the issues of the three- and four-server topologies by providing a dedicated tier for the Index and Application roles. Additional servers extend the middle tier by taking on new roles, for example the Excel Calculation Services role and the Microsoft Office Project Server 2007 role.</p>
<p style="padding-left: 30px; text-align: justify; ">The following table summarizes the farm topologies:</p>
<table style="text-align: justify; border-width: 1px;" border="1" cellspacing="0" cellpadding="0">
<tbody>
<tr>
<td width="13%" valign="top">
<p align="center"><strong><em>Farm Servers</em></strong><em></em></p>
</td>
<td width="44%" valign="top">
<p align="center"><strong>Performance</strong></p>
</td>
<td width="42%" valign="top">
<p align="center"><strong>Redundancy</strong></p>
</td>
</tr>
<tr>
<td width="13%" valign="top">
<p align="center"><em>3 &#8211; 4</em><em></em></p>
</td>
<td width="44%" valign="top">Index on WFE with Query on another box<br /><em>App Roles on WFE</em></td>
<td width="42%" valign="top">Index on Database, with Query on WFE<br /><em>App Roles on WFE</em></td>
</tr>
<tr>
<td width="13%" valign="top">
<p align="center"><em>5</em><em></em></p>
</td>
<td width="44%" valign="top">App Roles on Middle Tier<br /><em>Dedicated Index Server on Middle Tier</em></td>
<td width="42%" valign="top">App Roles on WFE<br /><em>Dedicated Index Server on Middle Tier</em></td>
</tr>
<tr>
<td width="13%" valign="top">
<p align="center"><em>6</em><em></em></p>
</td>
<td width="44%" valign="top">Dedicated Web Server for Crawling, outside NLB<br /><em>Dedicated Index Server on Middle Tier</em></td>
<td width="42%" valign="top">App Roles on Middle Tier in NLB<br /><em>Dedicated Index Server on Middle Tier</em></td>
</tr>
</tbody>
</table>
<p style="padding-left: 30px; text-align: justify; ">
<p style="padding-left: 30px; text-align: justify; ">To optimize the overall performance of a SharePoint farm with five or more servers, configure a dedicated Web server for crawling content, especially when crawling a server farm that contains more than 500 gigabytes (GB) of content or crawling content over the WAN. To ensure that user requests are not affected by content crawling, remove the dedicated Web server from the network load balancing rotation. This is especially important in global environments in which the off-peak hours of a regional farm (when crawl jobs are likely to be scheduled) coincide with the peak hours of the central farm.</p>
<p style="text-align: justify; "><strong>Plan extranet topology</strong></p>
<p style="padding-left: 30px; text-align: justify; ">Choose the topology based on the requirements for external users. This topology provides the basis for network extensibility of application servers and the communications between them.</p>
<p style="padding-left: 30px; text-align: justify; ">The simplest topology is the &#8220;Edge firewall topology&#8221;, shown in the following diagram from the TechNet article.</p>
<p style="padding-left: 30px; text-align: center; "><img class="size-full wp-image-2996 aligncenter" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/edgetopology.jpg" alt="edgetopology" width="250" height="95" /></p>
<p style="padding-left: 30px; text-align: justify; ">This topology is suitable for small farms, where there is no need to separate internal services from the corporate network or to secure communications between server farms. All remote users are separated from the farm by an ISA server, which acts as a reverse proxy.</p>
<p style="padding-left: 30px; text-align: justify; ">For big farms, where security of communications is a priority, the recommended topology is the &#8220;Back-to-back perimeter topology&#8221;. This topology is very flexible and adapts well to network changes.</p>
<p style="padding-left: 30px; text-align: center;"><img class="size-full wp-image-2997 aligncenter" src="http://sharepointmagazine.net/wp-content/uploads/2009/03/backendtopology.jpg" alt="backendtopology" width="400" height="255" /></p>
<p style="padding-left: 30px; text-align: justify; ">The main advantage of this topology is that it isolates the server farm in a separate perimeter network. Layers logically separate all servers, and communications are under control &#8211; a security breach affects only a specific layer, not the entire farm. External user access is confined to the perimeter network, and remote and corporate users can be kept in separate Active Directory environments.</p>
<p style="padding-left: 30px; text-align: justify; ">There are some other extranet topology variations, but most of them are based on the &#8220;Back-to-back perimeter topology&#8221; with some modifications.</p>
<p style="padding-left: 30px; text-align: justify; ">
<p style="padding-left: 30px; text-align: justify; ">Detailed information about farm topologies can be found in the following documents:</p>
<ol>
<li>Best practices for My Sites: <a href="http://technet.microsoft.com/en-us/library/cc262706.aspx">http://technet.microsoft.com/en-us/library/cc262706.aspx</a></li>
<li>Best practices for team collaboration sites: <a href="http://technet.microsoft.com/en-us/library/cc850694.aspx">http://technet.microsoft.com/en-us/library/cc850694.aspx</a></li>
<li>Planning an Extranet Environment for Office SharePoint Server: <a href="http://technet.microsoft.com/en-us/library/cc262400.aspx">http://technet.microsoft.com/en-us/library/cc262400.aspx</a></li>
</ol>
<h2>&#8220;Logical Planning&#8221;</h2>
<p><strong>Plan site collections</strong></p>
<p style="padding-left: 30px; ">Plan the number of site collections and sub sites in advance, including their content, location and security. Start with a single site collection and several sub sites rather than several site collections, and avoid creating a new site collection unless there is a requirement for it. The reason is that each new site collection works like a separate application, with its own isolated scope for features, templates and search. A single site collection with sub sites is much easier to maintain than several site collections.</p>
<p><strong>Organize site collections across several content databases</strong></p>
<p style="padding-left: 30px;">Do not end up with one big content database, because data optimisation becomes troublesome in that case. For small and development environments, a single content database might be the preferable choice. For large farms, however, create several content databases and distribute the site collections among them. Having several content databases helps to address the following:</p>
<ul style="padding-left: 30px;">
<li>Keep content database size &lt;100 GB; larger databases could hinder performance (Microsoft recommendation).</li>
<li>Data usage optimization.</li>
<li>Simplify farm backup and restoration.</li>
<li>Flexibility for Disaster Recovery (DR) strategies.</li>
</ul>
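<p style="padding-left: 30px;">One way to plan such a distribution is sketched below. It is only an illustration: the site collection names and sizes are invented, and the "largest first, first database with room" strategy is a generic packing heuristic chosen here for simplicity, not a method prescribed by Microsoft.</p>

```python
def assign_to_databases(site_sizes_gb, cap_gb=100):
    """Group site collections into content databases that stay under cap_gb."""
    databases = []  # each entry: {"sites": [names], "used_gb": total size}
    # Place the largest site collections first (first-fit decreasing).
    ordered = sorted(site_sizes_gb.items(), key=lambda item: item[1], reverse=True)
    for name, size in ordered:
        if size >= cap_gb:
            raise ValueError(f"{name} alone reaches the {cap_gb} GB limit")
        for db in databases:
            # Reuse the first database that still has room for this site.
            if cap_gb - db["used_gb"] > size:
                break
        else:
            # No existing database has room, so plan a new one.
            db = {"sites": [], "used_gb": 0}
            databases.append(db)
        db["sites"].append(name)
        db["used_gb"] += size
    return [db["sites"] for db in databases]

# Hypothetical site collections and their expected sizes in GB.
plan = assign_to_databases({"hr": 60, "finance": 50, "projects": 40, "wiki": 30})
print(plan)  # [['hr', 'wiki'], ['finance', 'projects']]
```

<p style="padding-left: 30px;">A real plan would also leave headroom for growth, so a lower cap than the hard limit may be the safer input.</p>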
<p style="padding-left: 30px;">More details about site collections in several content databases are available in the following blog post: <a href="http://msmvps.com/blogs/laflour/archive/2008/10/14/tips-to-create-a-site-collection-in-new-content-database.aspx" target="_blank">http://msmvps.com/blogs/laflour/archive/2008/10/14/tips-to-create-a-site-collection-in-new-content-database.aspx</a></p>
<p><strong>Script actions</strong></p>
<p style="padding-left: 30px;">Prefer to script the installation and SharePoint farm configuration actions: setting roles, creating web sites and site collections, etc. Because of SharePoint&#8217;s complexity, configuring a farm successfully on the first attempt has a good chance of failing. Running scripts to repeat all actions saves time when something goes wrong and a new server installation is required.</p>
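<p style="padding-left: 30px;">A repeatable script can be as simple as a list of commands that is replayed on a fresh server. The sketch below assumes the stsadm command-line tool is on the PATH and uses its createsite operation with the -url, -ownerlogin, -owneremail and -sitetemplate parameters; verify these against your own installation before relying on them. The URL, account and e-mail address are hypothetical, and the script only prints the commands unless dry_run is switched off.</p>

```python
import subprocess

def create_site_command(url, owner_login, owner_email, template="STS#0"):
    """Build the stsadm argument list that creates one site collection."""
    return [
        "stsadm", "-o", "createsite",
        "-url", url,
        "-ownerlogin", owner_login,
        "-owneremail", owner_email,
        "-sitetemplate", template,
    ]

def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for command in commands:
        print(" ".join(command))
        if not dry_run:
            # check=True stops the script at the first failing command.
            subprocess.run(command, check=True)

# Hypothetical values, for illustration only.
commands = [
    create_site_command("http://intranet/sites/hr", "CONTOSO\\admin", "admin@contoso.com"),
]
run_all(commands)
```

<p style="padding-left: 30px;">Keeping the command list in one place means the same sequence can be reviewed, stored in source control and replayed after a failed installation.</p>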
<hr />In the next part we will review the actual SharePoint installation and the basic farm configuration.</p>
]]></content:encoded>
			<wfw:commentRss>http://sharepointmagazine.net/technical/administration/best-practices-of-sharepoint-farm-configuring-and-deployment-part-1-architectural-and-logical-planning/feed</wfw:commentRss>
		<slash:comments>7</slash:comments>
		</item>
	</channel>
</rss>
