Ensuring Web Site Performance – Why, What and How to Measure Automated and Accurately
Posted: October 10, 2011 Filed under: ASP.NET, Performance | Tags: Ajax, App, Apps, ASP, Dynatrace, Monitoring, Notes, PDF, Perf, perfmon, Performance, UI, User Interface, Web | By dynaTrace
Using network analysis tools like HTTP Watch or Fiddler, one can visualize the individual downloads in a timeline view.
There are different stages of perceived performance and perceived response time.
The First Impression of speed is the time it takes to see something in the browser’s window (Time To First Visual). We can measure that by looking at the first Rendering (Drawing) activity. Get a detailed description about Browser Rendering and the inner workings of the Rendering Engine at Alois’s blog entry about Understanding Browser Rendering.
The Second Impression is when the initial page is fully loaded (Time To OnLoad). This can be measured by looking at the onLoad event which is triggered by the browser when the DOM is fully loaded meaning that the initial document and all embedded objects are loaded.
The Third Impression is when the web site actually becomes interactive for the user (Time To Interactivity). Heavy JavaScript execution that manipulates the DOM causes the web page to become non-interactive for the end user. This can very often be seen when expensive CSS Selector Lookups (check out the blogs about jQuery and Prototype CSS Selector Performance) are used or when using dynamic elements like JavaScript Menus (check out the blog about dynamic JavaScript menus).
How to Speed Up sites by more than 50% in 5 minutes
Minute 1: Record your dynaTrace AJAX Session
Always turn on argument capturing
The reason I do that is that I want to see the CSS Selectors passed to the $ or $$ lookup functions from various JavaScript frameworks like jQuery or Prototype. The main problem I’ve identified in my work is CSS Selectors per className that cause huge overhead on pages with many DOM elements. I wrote two blogs about the performance impact of CSS Selectors in jQuery and Prototype.
Minute 2: Identify poorly performing pages
1. Having high JavaScript execution.
2. Large amount of Rendering Time – that is time spent in the browser’s rendering engine
3. Page load times (time till the onLoad event was triggered) of more than 5 seconds!!
4. Very high Network Time although the page doesn’t have a very bad page load time. This means that we have content that was loaded after the onLoad event
Minute 3: Analyze Timeline of slowest Page
1. The script ?? takes xxx ms when it gets loaded
2. An XHR Request at the very beginning takes xxx ms
3. We have about 80 images all coming from the same domain – this could be improved by using multiple domains
4. We have calls to external apps like Facebook, Google Ads or Google Analytics
Minute 4: Identify poorly performing CSS Selectors
They change the class name of the body to “en”, which takes 550ms to execute.
The site makes heavy use of the CSS Selectors to look up elements by class name. This type of lookup is not natively supported by Internet Explorer and therefore jQuery has to iterate through the whole DOM to find those elements. A better solution would be to use unique IDs – or at least add the tag name to the selector string – this also helps jQuery as it first finds all elements by tag name (which is natively implemented and therefore rather fast) and then only has to iterate through these elements.
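To make the difference tangible, here is a minimal sketch of the two lookup strategies. It is a toy model in Python, not jQuery's actual implementation: `Element`, `Document` and the element counts are made up for illustration, and the tag index only stands in for the browser's native lookup by tag name.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Element:
    tag: str
    classes: set = field(default_factory=set)


class Document:
    """Toy stand-in for a DOM (hypothetical, not a real browser API)."""

    def __init__(self, elements):
        self.elements = list(elements)
        # Models the browser's native, fast lookup by tag name.
        self._by_tag = defaultdict(list)
        for el in self.elements:
            self._by_tag[el.tag].append(el)

    def find_by_class(self, class_name):
        """'.name' lookup without native support: visit every element."""
        matches = [el for el in self.elements if class_name in el.classes]
        return matches, len(self.elements)

    def find_by_tag_and_class(self, tag, class_name):
        """'tag.name' lookup: native tag lookup first, then filter."""
        candidates = self._by_tag.get(tag, [])
        matches = [el for el in candidates if class_name in el.classes]
        return matches, len(candidates)


# Assumed page: 900 divs, 50 spans with class "price", 50 plain spans.
elements = [Element("div") for _ in range(900)]
elements += [Element("span", {"price"}) for _ in range(50)]
elements += [Element("span") for _ in range(50)]
doc = Document(elements)

_, visited_all = doc.find_by_class("price")
_, visited_tag = doc.find_by_tag_and_class("span", "price")
print(visited_all, visited_tag)  # 1000 100
```

Both lookups return the same 50 elements, but the tag-qualified one only has to inspect the spans.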
Minute 5: Identify network bottlenecks
The solution for this problem is domain sharding. Using 2 domains to host the images allows the browser to open twice as many physical connections and download more images in parallel. In the ideal case this cuts the download time of those images by about 50%.
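The 50% figure follows from simple arithmetic, sketched below. The 80 images come from the page analyzed above; the 2 connections per domain and the 100 ms per image are assumptions for illustration, and the model ignores bandwidth limits and DNS lookups for the extra domain.

```python
import math


def download_rounds(num_images, domains, connections_per_domain):
    """Number of sequential 'waves' needed when every connection is busy."""
    parallel = domains * connections_per_domain
    return math.ceil(num_images / parallel)


def estimated_download_ms(num_images, domains, connections_per_domain, ms_per_image):
    """Rough estimate: each wave takes as long as one image download."""
    rounds = download_rounds(num_images, domains, connections_per_domain)
    return rounds * ms_per_image


# Assumed values: 2 connections per domain, 100 ms per image.
print(estimated_download_ms(80, 1, 2, 100))  # 4000
print(estimated_download_ms(80, 2, 2, 100))  # 2000
```

With one domain the 80 images need 40 waves, with two domains only 20, which is where the halved download time comes from.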
Besides the problems with CSS Selectors and Network Requests we see problems with poorly performing JavaScript routines (very often from 3rd party libraries), too many JavaScript files on the page, too many XHR (XmlHttpRequests) Requests to the server and slow responses from the server of those XHR Requests.
Optimizing Data Intensive Web Pages by Example
Using tools like dynaTrace AJAX Edition for Internet Explorer, YSlow or PageSpeed for Firefox or DevTools for Chrome enables automating web site performance measuring in manual and automated test environments.
URLs Measured Tab: check out ShowSlow. The site is really great. It combines YSlow and PageSpeed metrics and visualizes them in a really nice way.
I then took a close look how many URLs this page is displaying. Obviously ShowSlow really has become a victim of its own success – it is over 9000 URLs. So, did they do something wrong? Actually what happened is quite typical for an iterative development approach. You build an application that satisfies certain requirements and take care of further requirements later. Additionally, you normally don’t expect to be so successful so fast.
We can additionally put the style definition into a separate file which will be cached. This will reduce download size even more. Another interesting side effect of this change is the reduction in layout calculation time.
The problem we now have to solve is how to populate the table. If we wrote all rows immediately via JavaScript, this would definitely make our web page completely unresponsive. So I decided to render only the first 200 rows and then dynamically render an additional 100 rows when the user scrolls to the end of the page.
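The batching logic behind this can be sketched independently of the browser. The following is a minimal Python model, not the page's actual JavaScript: `IncrementalTable` and its method names are hypothetical, and "rendering" simply means returning the next batch of rows.

```python
class IncrementalTable:
    """Hands out rows in batches: 200 up front, 100 more per scroll-to-end."""

    def __init__(self, rows, initial=200, step=100):
        self._rows = rows
        self._initial = initial
        self._step = step
        self.rendered = 0  # number of rows handed out so far

    def render_initial(self):
        return self._next_batch(self._initial)

    def on_scroll_to_end(self):
        return self._next_batch(self._step)

    def _next_batch(self, count):
        batch = self._rows[self.rendered:self.rendered + count]
        self.rendered += len(batch)
        return batch


table = IncrementalTable(list(range(450)))
print(len(table.render_initial()))    # 200
print(len(table.on_scroll_to_end()))  # 100
```

Each call does a bounded amount of work, so no single step blocks the page for long, no matter how many rows the data set contains.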
Andi wrote an interesting blog post about String concatenation problems in IE.
What are our takeaways? First, avoid repetitive markup on pages as much as possible. In our case, just getting rid of the style definitions helped us to reduce the payload by 1.7 MB. As we have seen this is especially important for data-driven sites, as each additional byte of content multiplies by the number of data items presented.
Secondly, you have to be careful with the markup-to-actual-data ratio. In this case it was rather low. Admittedly this will be different for other web pages; however, you should always keep this in mind. Especially for data-driven services you should consider only sending the actual data and building the markup on the client side. In this case you additionally have to be careful regarding JavaScript execution time.
From an application design perspective this means we have to separate our application into the classical three MVC layers.
• The model layer – meaning the “pure” data – which in our case was put into the JSON file.
• The controller layer – which in our case is the HTML page itself and the JavaScript logic to dynamically manipulate the DOM.
• The view layer – in our case the style definitions – which are separated from the other two layers.
Hands-On Guide: Verifying Web Site against Performance Best Practices
• Time to First Impression/Drawing:
- Recommendation: < 1s is great. <2.5s is acceptable
• Time to onLoad: 8.25s
- Recommendation: < 2s is great. <4s is acceptable
• Time to Fully Loaded: 8.6s
- Recommendations: < 2s is great. <5s is acceptable
• Number of HTTP Requests: 201
- Recommendations: < 20 is great. < 100 is acceptable (This one is a hard recommendation as it really depends on the type of website – but – it is a good start to measure this KPI)
• Number and Impact of HTTP Redirects: 1/1.44s
- Recommendations: 0. Avoid Redirects whenever possible
• Number and Impact of HTTP 400s: 1/0.71s
- Recommendations: 0. Avoid any 400s and 500s
• Size of JavaScript/CSS/Images: ~370kb/220kb/890kb
- Recommendations: It is hard to give a definite threshold value. Keep in mind that these files need to be downloaded and parsed by the browser. The more content there is the more work on the browser. The goal must be to remove all information that is not needed for the current page. I often see developers packing everything in a huge global .js file. That might be a good practice but too often only a fraction of this code is actually used by the end-user. It is better to load what needs to be loaded in the beginning and delay load additional content when really needed
• Max/Average Wait Time: 4.31s/1.9s
- Recommendations: < 20ms is good. < 50ms is acceptable (as you can see – we are FAR OFF these numbers in this example)
• Single Resource Domains: 1
- Recommendations: 0. Try to avoid single resource domains. It is not always possible – but do it if you can
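Checks like these are easy to automate. Below is a minimal sketch that rates measured KPIs against the thresholds listed above. The dictionary keys, the function name and the "needs work" label are my own naming, and only KPIs with numeric recommendations are included.

```python
# (great_below, acceptable_below) per KPI, taken from the recommendations above.
THRESHOLDS = {
    "time_to_first_impression_s": (1.0, 2.5),
    "time_to_onload_s": (2.0, 4.0),
    "time_to_fully_loaded_s": (2.0, 5.0),
    "http_requests": (20, 100),
    "avg_wait_time_ms": (20, 50),
}


def rate(kpi, value):
    """Classify a measured value as 'great', 'acceptable' or 'needs work'."""
    great_below, acceptable_below = THRESHOLDS[kpi]
    if value < great_below:
        return "great"
    if value < acceptable_below:
        return "acceptable"
    return "needs work"


# Values measured in the example above (average wait time 1.9s = 1900 ms).
measured = {
    "time_to_onload_s": 8.25,
    "time_to_fully_loaded_s": 8.6,
    "http_requests": 201,
    "avg_wait_time_ms": 1900,
}
for kpi, value in measured.items():
    print(kpi, value, rate(kpi, value))
```

For the example page every one of these KPIs ends up in the "needs work" bucket.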
Usage of Browser Caching
Browsers can cache content such as images, JavaScript or CSS files. Caching elements greatly improves browsing behavior for revisiting users as they do not need to download the same resources again.
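Whether a resource can be cached is visible in its response headers. The following is a simplified sketch that only looks at the `Cache-Control` header. It assumes the headers arrive as a plain dict with exactly that key, and it ignores `Expires`, `ETag` and the other rules a real browser applies.

```python
def cache_lifetime_seconds(headers):
    """Return the max-age in seconds, 0 if caching is disabled,
    or None if Cache-Control gives no lifetime at all."""
    cache_control = headers.get("Cache-Control", "")
    directives = [d.strip().lower() for d in cache_control.split(",") if d.strip()]
    if "no-store" in directives or "no-cache" in directives:
        return 0
    for directive in directives:
        if directive.startswith("max-age="):
            try:
                return max(0, int(directive.split("=", 1)[1]))
            except ValueError:
                return None  # malformed value, treat as unknown
    return None


print(cache_lifetime_seconds({"Cache-Control": "public, max-age=86400"}))  # 86400
print(cache_lifetime_seconds({"Cache-Control": "no-store"}))               # 0
print(cache_lifetime_seconds({}))                                          # None
```

Static resources that come back with `0` or `None` are the candidates to look at first.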
Network Resources and Transfers
This area of analysis focuses on unnecessary requests for all users (not just for revisiting users caused by wrong cache settings). The first types of requests that should be avoided are HTTP Redirects (300s), Authentication Issues (400s) and Server-Errors (500s).
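Given the status codes of all requests on a page, the avoidable ones can be counted with a few lines. This is a minimal sketch: the category names are my own, and I treat 304 (Not Modified) as a normal response, since it is a cache validation answer rather than a redirect.

```python
from collections import Counter


def classify_status(status):
    """Map an HTTP status code to a coarse category."""
    if 300 <= status < 400 and status != 304:
        return "redirect"
    if 400 <= status < 500:
        return "client_error"
    if 500 <= status < 600:
        return "server_error"
    return "ok"


def avoidable_requests(statuses):
    """Count requests per avoidable category (redirects, 4xx, 5xx)."""
    counts = Counter(classify_status(s) for s in statuses)
    counts.pop("ok", None)
    return dict(counts)


print(avoidable_requests([200, 200, 302, 404, 304, 200]))
# {'redirect': 1, 'client_error': 1}
```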
Application Server-Side Processing Time
• First request on the page -> usually returns the initial HTML
• Requests that return HTML -> generated content (this also may include static HTML pages)
• Requests on URLs ending with aspx, jsp, php
• Requests that send GET or POST parameters data to the server
• All XHR/AJAX Requests
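The list above translates directly into a simple heuristic. Here is a sketch under the assumption that each captured request is available as a small record; the `Request` type, its fields and the reason strings are hypothetical and not taken from any particular tool.

```python
from dataclasses import dataclass
from urllib.parse import urlparse

DYNAMIC_EXTENSIONS = (".aspx", ".jsp", ".php")


@dataclass
class Request:
    url: str
    method: str = "GET"
    content_type: str = ""
    is_xhr: bool = False


def server_side_reasons(request, is_first=False):
    """Return the reasons why a request likely involves server-side processing."""
    parsed = urlparse(request.url)
    reasons = []
    if is_first:
        reasons.append("first request")
    if request.content_type.lower().startswith("text/html"):
        reasons.append("returns HTML")
    if parsed.path.lower().endswith(DYNAMIC_EXTENSIONS):
        reasons.append("dynamic extension")
    if parsed.query or request.method.upper() == "POST":
        reasons.append("sends parameters")
    if request.is_xhr:
        reasons.append("XHR/AJAX")
    return reasons


print(server_side_reasons(Request("http://example.com/logo.png", content_type="image/png")))
# []
print(server_side_reasons(Request("http://example.com/search.php?q=shoes", content_type="text/html")))
# ['returns HTML', 'dynamic extension', 'sends parameters']
```

Requests with an empty reason list are most likely static content; all others are worth a look at their server time.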
Here are some MUST READS:
• Best Practices from Google and Yahoo
• Blog from Steve Souders and John Resig (jQuery)
• dynaTrace Blogs on AJAX and Performance Almanac
Additionally you should check out How to Get Started with dynaTrace AJAX Edition and the Webinar we had with Monster.com to better understand how dynaTrace AJAX Edition can help you analyze your website.
Hunting Lost Treasures: Understanding and Finding Memory Leaks
Posted: October 10, 2011 Filed under: .NET, Java, Performance | Tags: alloc, allocations, App, Apps, CLR, Dispose, Dynatrace, Free, Garbage, GC, Heap, Java, JVM, mem, Memory, Monitoring, Notes, Optimizations, Optimize, PDF, Perf, perfmon, Performance, Runtime, Stack, Virtual | By dynaTrace
Memory leaks – together with inefficient object creation and incorrect garbage collector configuration – are the top memory problems.
We need a heap analyzer for analyzing heap content and a console to collect and visualize runtime performance metrics.
A central concept in understanding the origins of memory leaks is Garbage Collection roots. A GC root is a reference which only has outgoing and no incoming references. Every live object on the heap is reachable from at least one GC root. If an object is no longer reachable from any GC root, it is marked as unreachable and is ready for Garbage Collection.
There are three main types of GC roots.
• Temporary variables on stack of threads
• Static fields of classes
• Native references in JNI
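The reachability rule can be sketched as a plain graph traversal. This is a simplified model and not how a real JVM or CLR collector is implemented: the heap is a dict from object name to the names it references, and all object names are made up.

```python
def reachable(roots, references):
    """Return every object reachable from the given GC roots."""
    seen = set()
    stack = list(roots)
    while stack:
        obj = stack.pop()
        if obj in seen:
            continue
        seen.add(obj)
        stack.extend(references.get(obj, ()))
    return seen


# Hypothetical heap: a static field, a stack variable and two orphaned objects.
heap = {
    "static_cache": ["a", "b"],
    "a": ["c"],
    "b": [],
    "c": [],
    "stack_var": ["d"],
    "d": [],
    "e": ["f"],  # e and f reference each other ...
    "f": ["e"],  # ... but no root leads to them
}
live = reachable(["static_cache", "stack_var"], heap)
garbage = set(heap) - live
print(sorted(garbage))  # ['e', 'f']
```

Note that `e` and `f` still reference each other. What matters is reachability from a root, not whether any reference to an object exists.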
Collections are the critical part here, as they can grow continuously over time while holding an ever-increasing number of references. This means that most memory leaks are caused by collections which are directly or indirectly referenced by static fields.
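The pattern looks roughly like the sketch below. It uses a Python class attribute as an analog of a static field in Java or .NET; `SessionRegistry` and its methods are hypothetical names for illustration.

```python
class SessionRegistry:
    # Class-level dict: the analog of a static field. It lives as long as
    # the class itself, so everything it references stays reachable.
    _sessions = {}

    @classmethod
    def register(cls, session_id, data):
        cls._sessions[session_id] = data

    @classmethod
    def unregister(cls, session_id):
        cls._sessions.pop(session_id, None)

    @classmethod
    def size(cls):
        return len(cls._sessions)


# The leak: sessions are registered but never unregistered.
for session_id in range(1000):
    SessionRegistry.register(session_id, {"user": session_id})
print(SessionRegistry.size())  # 1000
```

None of the 1000 entries can be collected, because the collection is reachable through the class-level field. The fix is to call `unregister` when a session ends.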
In memory analysis you will in this context often hear about the concept of dominators or the dominator tree. The concept of a dominator comes from graph theory and is defined as follows:
A node dominates another node if the other node can only be reached via it. For memory management this means that A is a dominator of B if every reference path to B leads through A; in the simplest case, B is only referenced by A. A dominator tree is then a whole tree of objects where this is true for the root object and all referenced objects.
In case there are no more references to a dominator object all referenced objects will be freed up as well. Large dominator trees are therefore good candidates for memory leaks.
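To illustrate the definition, here is a sketch of the classic iterative dominator computation on a tiny object graph. Heap analyzers use more efficient algorithms; the graph and all object names are made up for the example.

```python
def dominators(graph, root):
    """Return, for each node reachable from root, the set of its dominators."""
    # Collect the nodes reachable from the root.
    nodes, stack = set(), [root]
    while stack:
        node = stack.pop()
        if node in nodes:
            continue
        nodes.add(node)
        stack.extend(graph.get(node, ()))

    # Build predecessor lists (who references whom).
    preds = {node: [] for node in nodes}
    for node in nodes:
        for target in graph.get(node, ()):
            preds[target].append(node)

    # Start with "everything dominates everything" and shrink to a fixpoint.
    dom = {node: set(nodes) for node in nodes}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for node in nodes - {root}:
            new = set.intersection(*(dom[p] for p in preds[node])) | {node}
            if new != dom[node]:
                dom[node] = new
                changed = True
    return dom


# "shared" is referenced both via the cache and via the session.
graph = {
    "root": ["cache", "session"],
    "cache": ["a", "b"],
    "a": ["shared"],
    "session": ["shared"],
}
dom = dominators(graph, "root")
print(sorted(dom["a"]))       # ['a', 'cache', 'root']
print(sorted(dom["shared"]))  # ['root', 'shared']
```

Dropping the reference to `cache` frees `a` and `b`, since `cache` dominates both. It does not free `shared`, which is still reachable via `session`.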
The general advice is to work with smaller heaps.
A good test is to create a heap dump from a production-sized instance and calculate the GC size of all HTTP sessions. In case you have problems solving this simple problem, you should either upgrade your tooling or decrease your heap size. Otherwise you might end up in a situation where you have no means to diagnose a memory leak in your application.
Why “top ten” Performance Reports are not the final answer
Posted: October 10, 2011 Filed under: Database, Performance | Tags: App, Apps, Databases, DBA, Dynatrace, Monitoring, Notes, Optimizations, Optimize, PDF, Perf, perfmon, Performance, SQL, Transactions | From Dynatrace
enrich monitoring data with additional context information like method parameters or bind variables of database statements, for example.
the average execution time.
This will lead us to statements or requests which are the slowest on average. Averages however might not be – and in many cases are not – the best metric to use.
Stats
A much better approach is to use percentiles instead. A common choice is the 95th percentile. This means the time in which 95 percent of all requests have been processed. This number is much closer to a representative performance value.
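A small sketch shows how far the two metrics can drift apart. The response times are invented for illustration, and the percentile uses the simple nearest-rank definition; monitoring tools may interpolate differently.

```python
import math
from statistics import mean


def percentile(values, p):
    """Nearest-rank percentile: smallest value covering p percent of the data."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Assumed response times in ms: 18 fast requests, one slower, one outlier.
times = [100] * 18 + [200, 5000]

print(mean(times))            # 350
print(percentile(times, 50))  # 100
print(percentile(times, 95))  # 200
```

The single outlier pulls the average to 350 ms, although 95 percent of all requests finished within 200 ms.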
Instead of percentiles standard deviation is used frequently.
Sometimes, however, we might not be interested in the performance of a single statement or request but rather in the overall impact on our system. If, for example, we are suffering from massive CPU load, we want to identify the parts of the application which consume most of the CPU time. Here neither the average nor a percentile will be useful. Instead we will use the sum of all execution or CPU times.
Top ten reports do a good job in helping us to find the parts of an application which consume the most resources. So for all resource-oriented problems they are a great help.
We can use
- CPU time,
- network traffic or
- object creation
as metric, and then find the points of the application where optimization will have the greatest impact. For these kinds of problems it is important to work with sum values only, as we are interested in global optimization.
What about using top ten reports to find the slowest web requests, for example?
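Why the sum matters becomes obvious with a small sketch. The method names and CPU times below are invented; each sample is a `(name, cpu_ms)` pair for one invocation.

```python
from collections import defaultdict


def rank_by_total(samples, top=10):
    """Top consumers by summed cost."""
    totals = defaultdict(int)
    for name, cost in samples:
        totals[name] += cost
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:top]


def rank_by_average(samples, top=10):
    """Top consumers by average cost per invocation."""
    totals, counts = defaultdict(int), defaultdict(int)
    for name, cost in samples:
        totals[name] += cost
        counts[name] += 1
    averages = {name: totals[name] / counts[name] for name in totals}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:top]


# Assumed workload: a slow but rare method and a cheap but very hot one.
samples = [("renderReport", 900)] * 2 + [("formatDate", 1)] * 10000

print(rank_by_total(samples, top=1))    # [('formatDate', 10000)]
print(rank_by_average(samples, top=1))  # [('renderReport', 900.0)]
```

A report sorted by average would point at `renderReport`, while the CPU is actually spent in `formatDate`.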
Transactional contribution-based Optimization approach
For each database statement, the approach looks at
1) in which transactions it was used, and
2) how much it contributes to the transaction response time.
Following this approach I can easily detect both: transactions with long-running statements that contribute heavily to transaction response times, and fast statements which are executed frequently and therefore also contribute heavily to response times.
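The contribution calculation itself is straightforward, as the following sketch shows. The data layout, transaction names and timings are assumptions for illustration and not the format of any specific monitoring tool.

```python
from collections import defaultdict


def statement_contributions(transactions):
    """For each (transaction, statement) pair, compute execution count,
    total time and share of the transaction response time."""
    result = {}
    for tx_name, tx in transactions.items():
        totals = defaultdict(lambda: [0, 0])  # statement -> [count, total_ms]
        for statement, ms in tx["statements"]:
            totals[statement][0] += 1
            totals[statement][1] += ms
        for statement, (count, total_ms) in totals.items():
            result[(tx_name, statement)] = {
                "count": count,
                "total_ms": total_ms,
                "contribution": total_ms / tx["response_ms"],
            }
    return result


# Assumed data: one slow statement vs. one fast statement executed 150 times.
transactions = {
    "checkout": {"response_ms": 1000, "statements": [("SELECT order", 600)]},
    "search": {"response_ms": 500, "statements": [("SELECT item", 2)] * 150},
}
for key, stats in statement_contributions(transactions).items():
    print(key, stats)
```

Both statements account for 60 percent of their transaction's response time, although one takes 600 ms per execution and the other only 2 ms.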
The many faces of end-user experience monitoring
Posted: October 10, 2011 Filed under: ASP.NET, Performance, Reference | Tags: Ajax, App, Apps, ASP, Dynatrace, Monitoring, Notes, PDF, Perf, perfmon, Performance, UI, User Interface, Web | (by DynaTrace)
End-user monitoring is also often referred to as the last mile in monitoring.
The first question can be answered by tracking resource load times. Here we want to measure the following metrics:
• Time of First Byte – When was the first byte of the web page received
• Time to First Visual – When was the page content visible the first time. This is the first time a drawing operation occurred
• Time to On Load – When was the onLoad event of the page executed
• Time to Page Ready – When all initial content is ready and JavaScript execution can safely start.
The network times should be split up into
- wait time (the delay until a browser connection is available),
- DNS lookup time,
- transfer time and
- server time.
This is especially useful in diagnosing network-related problems. Furthermore, we want to see HTTP Headers to find improper caching configuration or HTTPS-related connection problems.
Synthetic Transactions
Synthetic Transactions are based on the concept that they emulate real users. They use pre-recorded scripts defining end-user behavior that are executed at defined intervals. Providers of these tools execute these scripts from more than one hundred locations worldwide.
Some solutions replay recorded HTTP traffic patterns, others drive real browser instances. The advantage of the second approach is clearly that it need not emulate browser behavior. Especially in Web 2.0 applications, which rely on the heavy usage of JavaScript and AJAX communication, only a browser-based approach is feasible.
Network Sniffing
Network Sniffing is the second category of end-user monitoring tools. Unlike synthetic transactions they rely on real end-user traffic. Special appliances are used which monitor the whole network traffic being sent between clients and web servers. These appliances however are now within your own network, meaning they are farther away from your end users.
The disadvantage of this approach is that no browser-level metrics are collected. Problems caused by massive
- DOM access,
- excessive JavaScript execution or
- rendering problems
cannot be found at this level.
Instrumentation at Browser Level
Monitoring at the browser level is achieved by injecting JavaScript monitoring code into the page. The easiest way to do this is using header and footer injection. Small portions of JavaScript code are injected at the beginning and the end of a page. This code will collect data for certain browser timings like first byte received, page completed or onLoad.
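On the server side, header and footer injection can be as simple as the sketch below. It assumes well-formed, lowercase `<head>` and `</body>` tags, and the two snippets with their `__pageStart` and `__pageEnd` variables are hypothetical placeholders for real monitoring code.

```python
# Hypothetical snippets: record a timestamp at the start and end of the page.
HEADER_SNIPPET = "<script>window.__pageStart = new Date().getTime();</script>"
FOOTER_SNIPPET = "<script>window.__pageEnd = new Date().getTime();</script>"


def inject_monitoring(html):
    """Insert the header snippet right after <head> and the footer
    snippet right before </body>; leave the page alone otherwise."""
    head_pos = html.find("<head>")
    if head_pos != -1:
        insert_at = head_pos + len("<head>")
        html = html[:insert_at] + HEADER_SNIPPET + html[insert_at:]
    body_end = html.rfind("</body>")
    if body_end != -1:
        html = html[:body_end] + FOOTER_SNIPPET + html[body_end:]
    return html


page = "<html><head><title>Shop</title></head><body><p>Hello</p></body></html>"
print(inject_monitoring(page))
```

In practice this usually happens in a filter or module of the web server, so that application pages do not have to change.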
An example is the Google Logging API for Speed Tracer, which uses the console logging API of Webkit.
Episodes by Steve Souders suggests the use of beacons. Beacons are small web requests with piggybacked monitoring information. Alternatively XHR requests can be used to communicate with a monitoring server. Both approaches however use browser connections which are also required by the application itself.
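The piggybacking idea can be sketched as follows: the collected timings are encoded into the query string of a small request. The endpoint URL and the parameter names are hypothetical and not part of Episodes or any other product.

```python
from urllib.parse import urlencode


def beacon_url(endpoint, timings):
    """Build a beacon URL that carries the timings as query parameters."""
    return endpoint + "?" + urlencode(timings)


url = beacon_url(
    "https://monitor.example.com/beacon",  # assumed monitoring endpoint
    {"first_byte_ms": 120, "onload_ms": 8250},
)
print(url)
# https://monitor.example.com/beacon?first_byte_ms=120&onload_ms=8250
```

In the browser such a URL is typically requested as a tiny image, so the monitoring server only has to log the query string.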
Plain page timing metrics might not be enough for Google Mail and other single-page, highly interactive applications. Monitoring the response times of XHR requests is essential to understand the communication behavior of the application. The ZK framework’s performance monitor, for example, offers such capabilities.
- JavaScript injection is easy to roll out; however, it has limitations regarding the data it can provide.
- Browser plug-ins enable the deepest insight into browser behavior but require explicit deployment, which makes the roll-out a challenge.
- Network sniffers – while being able to capture and correlate the total traffic of a user – have no insight into the browser.
- Synthetic transactions basically serve a slightly different purpose.
4 Reasons for not Using Twitter to grab attention
Posted: September 14, 2011 | Author: Snapjudge | Filed under: Microsoft | Tags: Comments, FB, marketing, Mktg, Naked Pizza, Networks, News, Opinions, Scoble, Social, Thoughts, Twits, Twitter, WSJ
There used to be toll-free numbers with catchy words. Now it is the Twitter handle. Every TV ad now ends with a Facebook link and a Twitter ID; some billboards just display the user name.
Today’s WSJ promotes Naked Pizza with an attractive title Finding New Investors, in 140 Characters or Less:
But why is Twitter not the best method to find clients, let alone investors?