Intelligent Infrastructure


Dispelling misconceptions: Data reduction technology delivers only upsides to SSD performance


Ever since SandForce introduced data reduction technology with the DuraWrite feature in 2009, some users have been confused about how it works and questioned whether it delivers the benefits we claim. Some even believe there are downsides to using DuraWrite with an SSD. In this blog, I will dispel those misconceptions.

Data reduction technology refresher
Four of my previous blogs cover the many advantages of using data reduction technology like DuraWrite.

In a nutshell, data reduction technology reduces the size of data written to the flash memory, but returns 100% of the original data when reading it back from the flash. This reduction in the required storage space helps accelerate reads and writes, extend the life of the flash and increase the dynamic over provisioning (OP).
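
To make that concrete, here is a minimal sketch of the idea in Python, using zlib purely as a stand-in (DuraWrite’s actual reduction runs transparently inside the SSD controller and its algorithm is not public): compressible data shrinks before it reaches the flash, already-compressed data does not, and reads always return the original bytes.

    # Minimal sketch of the data reduction idea. zlib is only a stand-in for
    # illustration; it is not DuraWrite's actual (controller-internal) algorithm.
    import os
    import zlib

    def bytes_written_to_flash(data: bytes) -> int:
        """Write the smaller of the compressed or raw form to flash."""
        return min(len(zlib.compress(data)), len(data))

    text_like = b"The quick brown fox jumps over the lazy dog. " * 1000  # compressible
    jpeg_like = os.urandom(45_000)                                       # effectively incompressible

    for name, data in [("text-like", text_like), ("JPEG-like", jpeg_like)]:
        written = bytes_written_to_flash(data)
        print(f"{name}: {len(data)} bytes from the host -> {written} bytes to flash")

    # The read path is lossless: 100% of the original data comes back.
    assert zlib.decompress(zlib.compress(text_like)) == text_like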



What is incompressible data?
Data is incompressible when data reduction technology is unable to reduce the size of a dataset, in which case the technology offers no direct benefit for that data. File types that are entirely or mostly incompressible include MPEG, JPEG, ZIP and encrypted files. However, data reduction technology applies to the entire SSD, so the free space created by the smaller, compressed files increases OP for all file types, even incompressible files.

The images below help illustrate this process. The image on the left represents a standard 256GB SSD filled to about 80% of capacity with a typical operating system, applications and user data. The remaining 20% of free space is automatically used by the SSD as dynamic OP. The image on the right shows how the same data stored on a data reduction-capable SSD can nearly double the available OP, because in this example the operating system, applications and half of the user data can be reduced.

Identical data on both SSDs.

 

Why is dynamic OP so important?
OP is the lifeblood of a flash memory-based SSD (which describes nearly every SSD available today); without OP, the SSD could not operate. Allocating more space for OP increases an SSD’s performance and endurance, and reduces its power consumption. In the illustrations above, both SSDs are storing about 30% of user data as incompressible files like MPEG movies and JPEG images. As I mentioned, data reduction technology can’t compress those files, but the rest of the data can be reduced. The result is that the SSD with data reduction delivers higher overall performance than the standard SSD, even with incompressible data.
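
For readers who want to see the arithmetic, here is a minimal sketch of the dynamic OP comparison using round numbers in the spirit of the illustrations above. The exact split of compressible versus incompressible data and the 2:1 reduction ratio are assumptions for illustration only, not measured DuraWrite figures; the real gain depends on the data.

    # Back-of-the-envelope dynamic over-provisioning (OP) comparison.
    # The capacity and fill level follow the example above; the data mix and
    # the 2:1 reduction ratio are assumptions for illustration only.
    CAPACITY_GB       = 256
    LOGICAL_DATA_GB   = 0.80 * CAPACITY_GB   # OS + applications + user data, as the host sees it
    INCOMPRESSIBLE_GB = 0.30 * CAPACITY_GB   # assumed MPEG/JPEG/ZIP-style files, stored as-is
    COMPRESSIBLE_GB   = LOGICAL_DATA_GB - INCOMPRESSIBLE_GB
    REDUCTION_RATIO   = 2.0                  # assumed 2:1 for the compressible portion

    def dynamic_op_gb(physical_data_gb: float) -> float:
        """Flash left free for the controller to use as dynamic OP."""
        return CAPACITY_GB - physical_data_gb

    standard_ssd = dynamic_op_gb(LOGICAL_DATA_GB)
    reducing_ssd = dynamic_op_gb(INCOMPRESSIBLE_GB + COMPRESSIBLE_GB / REDUCTION_RATIO)

    print(f"Standard SSD:       ~{standard_ssd:.0f} GB dynamic OP")   # ~51 GB
    print(f"Data-reduction SSD: ~{reducing_ssd:.0f} GB dynamic OP")   # ~115 GB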

Misconception 1: Data reduction technology is a trick
There’s no trickery with data reduction technology. The process is simple: it reduces the size of data to a degree that depends on the content, increasing SSD speed and endurance.

Misconception 2: Users with movie, picture, and audio files will not benefit from data reduction
As illustrated above, as long as an operating system and other applications are stored on the SSD, there will be at least some increase in dynamic OP and performance despite the incompressible files.

Misconception 3: Testing with all incompressible data delivers worst-case performance
Given that a typical SSD stores an operating system, programs and other data files, an SSD test that writes only incompressible data to the device would underestimate the performance of the SSD in user deployments.
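
For anyone running their own benchmarks, a more representative test mixes compressible and incompressible payloads rather than writing purely random data. Below is a minimal sketch of generating such a mix; the 50/50 split is an arbitrary assumption to tune toward the workload being modeled.

    # Sketch of a benchmark payload whose compressibility is closer to a real
    # system image than pure random data. The 50/50 split is an arbitrary
    # assumption; adjust it to match the workload being modeled.
    import os
    import zlib

    def make_payload(size: int, compressible_fraction: float = 0.5) -> bytes:
        """Mix repeating (compressible) bytes with random (incompressible) bytes."""
        compressible = int(size * compressible_fraction)
        return b"\xAA" * compressible + os.urandom(size - compressible)

    payload = make_payload(1 << 20)   # one 1 MiB chunk of test data
    ratio = len(zlib.compress(payload)) / len(payload)
    print(f"Payload compresses to {ratio:.0%} of its original size")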

Data reduction technology delivers
Data reduction technology, like SandForce DuraWrite technology, is often misunderstood to the point that users believe they would be better off without it. The truth is, with data reduction technology, nearly every user will see performance and endurance gains with their SSD regardless of how much incompressible data is stored.

 

10 Comments

  • You posted the following on tomshardware:

    The SSD cannot tell what files have been deleted until the OS uses the same sectors to store new files, but by that time the SSD has already wasted cycles garbage collecting data that was invalid but not known to the SSD as invalid.

    That information and the rest of what you posted was very informative. In fact, after scouring the web for information on how SSDs actually work … you might be the only person who has cleared up some of my confusion.

    I posted an issue I have with a specific SSD here:

    http://forums.anandtech.com/showthread.php?p=36995934#post36995934

    I’m trying to understand what the drive is doing and why it is doing it. I was hoping I could pick your brain on the matter.

    Thank you for all the good information too.

    - Jared

  • Kent Smith Says:

    Hey Jared. I am happy to answer any questions you might have on SSDs. Let me know your question and I will see what I can do to help.

  • Essentially I have two SSDs in a RAID 1 configuration that come loaded with a factory image. If I boot to a Windows XP x32 disk and image the drives, complete the OOBE (out-of-box experience), activate Windows, run a small number of the most recent updates, and reboot, I get a blue screen. If I immediately stop and do that same process all over again, I get NO blue screen.
    -
    It only happens on the first image of the drive, when the factory Win7 image exists. Any subsequent reimaging causes no issues. Furthermore, it is the exact same file that gets corrupted every time (there could be other files which I’m not aware of, but this one is a core system file, hence the blue screen).
    -
    At first I thought it was just a bad image or update, but I’ve spent weeks testing and it appears to be the model of SSD (I’ve tried dozens of the same model). If I secure erase the drive first, I have no issue. If I use a different brand of SSD, I have no issue. Is the garbage collection routine on the SSD possibly confusing the file in question with the factory image file that is no longer in use but still on the drive somewhere? I would assume such a massive chunk of data was just shuffled around before it could erase that block of NAND flash?

    • I should emphasize that the problem only happens one time, when this specific SSD has the factory image preloaded on it. The 2nd, 3rd, 4th, etc. attempts following the exact same steps don’t produce a blue screen.

  • I have to admit, after testing and studying SSDs for the past month, I am somewhat intrigued as to what is going on.

  • I am assuming that since Windows XP has no TRIM function, and neither does the imaging software, the data that exists on the drive (the factory image) is either A) not being moved or erased or B) being moved but not erased, and that is somehow playing a role in the problem.

  • Kent Smith Says:

    Jared,

    This BSOD problem you are having with the Samsung 840 Pro is very interesting indeed. I see you confirmed that changing the SSD to a different brand/model does not produce the same issue, so I am assuming you are hitting some strange interaction in the 840 Pro FW. The combination is likely so rare that Samsung has not seen it. I am interested in your test results after removing the RAID element from your config to see if that makes any difference.

    Also have you tried using something other than Ghost to create your images? In one of my personal system configs I ran into problems with Ghost on Windows 8, so I had to find a new application to create my images. I used Acronis True Image and it worked perfectly. I realize you are on Windows 7, but Ghost is one variable you have not mentioned swapping out to see if it is contributing to the problem.

    There is something I don’t understand in your procedure. You say only the first reimage will BSOD, but the second and subsequent reimage will work fine. What steps do you go through to get back to the first reimage? Or do you mean the BSOD only happens on brand new factory boxed Samsung 840 Pro SSDs?

    You said you can zero the drives or secure erase them and then have no problem with the BSOD. If you only see this problem on factory-new 840 Pro SSDs, what about adding the zero or secure erase step first to solve this strange problem?

    Separately, let me see if I can address some of your other questions on how any SSD would deal with TRIM overall.

    1. Operating systems historically did not send a delete command to the drive because HDDs did not need one; HDDs can simply overwrite the old data. That is why SSDs needed TRIM. http://intelligent.media.seagate.com/2013/05/01/did-you-know-hdds-do-not-have-a-delete-command-that-is-why-ssds-need-trim/

    2. SSDs cannot overwrite data in a previously written “page” without first erasing it at the larger “block” level. Instead, the controller keeps extra space available only to itself (over-provisioning), writes the new data to a new location, and marks the original data in the original location as invalid in its map. Eventually the controller must go through “garbage collection” to move the still-valid data in full blocks to new locations. I explain this in more detail in my blog http://intelligent.media.seagate.com/2013/04/04/gassing-up-your-ssd/ and the Flash Memory Summit presentation http://intelligent.media.seagate.com/files/2014/10/FMS-2012-Tutorial-E-21-Understanding-SSD-Overprovisioning-Kent-Smith.pdf

    3. When you secure erase an SSD, all controllers that I know about will clear out the physical to logical map. Only more extensive security protocols will actually erase all the flash locations immediately at the time the command is issued. The controller will keep track of blocks without any valid data and erase the blocks either as needed or as a background operation in advance of needing to write the new data to empty pages.

    4. Data on any storage device is “erased” either by overwriting the location holding it or simply by the operating system no longer needing it. HDDs don’t care whether the data is still valid, but SSDs do care, because they have to garbage collect around all valid data.

    5. TRIM is a command sent by the operating system to tell the SSD that previously written data is no longer required. TRIM must be supported by both the OS and the SSD controller, and most SSD controllers sold today and current operating systems fully understand it. However, most RAID environments are unable to pass the TRIM commands from the OS to the storage device; currently only a few of the simpler RAID configurations are supported by some chipsets. Support of TRIM in a RAID environment is not a function of the SSD: if the RAID configuration supports TRIM, it should work for any SSD that also supports TRIM.

    6. SSDs don’t really care whether a page has been previously written once it has been erased. Generally speaking, most factory images on an SSD are nearly blank. Some testing will likely have been conducted across the flash blocks, but the controller will ignore anything previously written and start from an assumed clean slate. Anything that caused a difference resulting in a BSOD would be a bug in the FW that should not be happening.

    7. Lack of TRIM operation in a system should not cause BSOD at any time. Lack of TRIM only means the SSD will be garbage collecting invalid data until the OS overwrites the same logical locations with new data. The net result of no TRIM is the SSD acts like it is at 100% capacity and only the native over-provisioning will be available during garbage collection. On an SSD with an operating TRIM command coming through, the controller will be able to use the free space not currently holding user data as dynamic over-provisioning.
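
    To tie points 2 and 7 together, here is a toy sketch of that logical-to-physical map (a conceptual illustration only, not any particular controller’s implementation): a host overwrite lands in a fresh page and invalidates the old copy, TRIM invalidates pages without writing anything, and all of that reclaimable plus never-written space is what the controller can use as dynamic over-provisioning.

        # Toy flash translation layer, illustrating points 2 and 7 above.
        # A conceptual sketch only, not any real controller's implementation.
        class ToyFTL:
            def __init__(self, total_pages: int):
                self.total_pages = total_pages
                self.l2p = {}          # logical page number -> physical page
                self.invalid = set()   # physical pages holding stale data (reclaimable by GC)
                self.next_free = 0     # next never-written physical page

            def write(self, lpn: int):
                """Host write: always lands in a fresh page; any old copy becomes invalid."""
                if lpn in self.l2p:
                    self.invalid.add(self.l2p[lpn])
                self.l2p[lpn] = self.next_free
                self.next_free += 1

            def trim(self, lpn: int):
                """TRIM: the OS says the data is no longer needed; nothing new is written."""
                if lpn in self.l2p:
                    self.invalid.add(self.l2p.pop(lpn))

            def dynamic_op(self) -> int:
                """Pages the controller can treat as free: never written or reclaimable."""
                return (self.total_pages - self.next_free) + len(self.invalid)

        ftl = ToyFTL(total_pages=100)
        for lpn in range(60):
            ftl.write(lpn)              # fill 60% of the drive
        ftl.write(5)                    # overwrite one page -> old copy becomes invalid
        ftl.trim(6)                     # OS deletes a file and sends TRIM
        print(ftl.dynamic_op())         # 39 never-written + 2 invalid pages = 41

    Without TRIM, the trim() call never happens, so deleted data stays “valid” in the map until the OS reuses those logical addresses, which is the effectively 100% full behavior described in point 7.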

    Let me know if your question on what happens to the original factory image is not answered in the data above and I can try explaining that portion differently.

  • I actually ordered a copy of Acronis today, before reading this, since I have the same concerns about Ghost that you do.
    -
    To clarify what I mean by brand new: we have complete systems supplied to us that we install an image on. These systems come with two 840 Pros in RAID 1. At first I was just opening up boxes with new computers and testing. I must have gone through about 8… and noticed each time that the BSOD only happened the first time. Eventually I couldn’t keep opening up new boxes, so I ended up making a raw image (using Ghost) of the drives before we load our image. Then I started using just one computer and applying the raw image first (to simulate a new drive), then applying our image. This also caused a BSOD, as I expected.
    -
    As for just running the secure erase first and then applying the image: this is something we added to our procedure today. Curious, though, do you think this should be standard practice when applying an image to any model of SSD?
    -
    #3 is interesting. I was not aware that that was all secure erase generally does.
    -
    One area where I am confused: if the 0s below represent the factory image, is this what happens on an SSD when I apply my image (represented by 1s):
    -
    00000000 11111111
    -
    On older drives I would expect the old image to be overwritten by my image, and it looks like you’re saying that happens on SSDs as well?
    -
    My knowledge of how the master file table, LBAs, and the operating system interact with the drive is somewhat spotty. Any additional information you might have on this would be very helpful.
    -
    Thanks, Jared

  • Kent Smith Says:

    Secure erase is a good way to start the SSD imaging process, just to ensure the controller is not tracking any old data that the imaging application does not initially overwrite.

    The original data on the SSD is present until the OS overwrites the same logical addresses. With an image copy you are likely to replace it immediately, but even if part of the old data is not immediately replaced, the new OS will not think anything is stored in those locations, so it will eventually write to those logical addresses and the SSD will properly replace the old data then. Therefore you only gain a marginal benefit from secure erasing the data, and only during the first full drive write. After that, the TRIM command will manage reclaiming the dynamic free space to increase performance across the subsequent full drive writes you will reach on a constant basis.

    The FTL is very complex on the back end, but pretty simple on the front end. It is a simple logical-to-physical mapping table between the OS and the physical flash memory. My FMS 2011 presentation goes into detail on that: http://intelligent.media.seagate.com/files/2014/09/FMS-2011-T1A-Garbage-Collection-Kent-Smith_Final.pdf

  • Thanks for clearing that up. And thank you for all the good information and help. Merry Christmas to you.
