Storing Large Property Values

SimpleDB limits each attribute value to 1024 characters. This is sufficient for many purposes, and storing extremely large values in SimpleDB is generally not recommended. But when you do need to store a property value larger than 1024 characters, you must split it across multiple attributes and track the original order of the pieces, since SimpleDB does not preserve attribute order.

Simol will do this for you automatically when you mark string properties with the SpanAttribute. The example below demonstrates a persistent object that would let us store (short) books in SimpleDB:

    public class Book
    {
        public Guid Id { get; set; }

        // false = store the spanned value without compression
        [Span(false)]
        public string Content { get; set; }
    }

Let's use this mapping to store the entire book of Acts (the longest book in the New Testament--about 140,000 characters) in SimpleDB. Here is the code:

    string content = File.ReadAllText(@"Resources\Acts.txt");
    Book book = new Book
    {
        Content = content
    };

After executing this code, we would find that the Book domain contains an item with 139 attributes, the first few of which contain the following text:

    Id                                    Content
    1a55e34b-0958-42df-bfad-8ff972c51e60  000Acts 1 Prologue 1 The former account I made, O Theophilus...
                                          001n the Holy Spirit has come upon you; and you shall be witnesses...
                                          002th one accord in prayer and supplication, with the women...

Simol uses the first three characters of each attribute to record the ordinal position of that "chunk" in the original property, which leaves 1,021 characters per attribute for content. Since SimpleDB allows up to 256 attributes of 1024 characters each for a single item, we can store books of up to 261,376 characters (256 x 1,021) using this method.
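
The chunking scheme described above can be sketched in a few lines of C#. This is only an illustration of the idea, not Simol's actual implementation: the SpanSketch class and its Split and Join methods are hypothetical names invented for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper for illustration only; not part of Simol.
public static class SpanSketch
{
    public const int MaxAttributeLength = 1024; // SimpleDB limit per attribute
    public const int PrefixLength = 3;          // "000", "001", ...
    public const int ChunkLength = MaxAttributeLength - PrefixLength; // 1021

    // Split a value into chunks, each prefixed with its zero-padded ordinal.
    public static List<string> Split(string value)
    {
        var chunks = new List<string>();
        for (int i = 0, n = 0; i < value.Length; i += ChunkLength, n++)
        {
            int length = Math.Min(ChunkLength, value.Length - i);
            chunks.Add(n.ToString("D3") + value.Substring(i, length));
        }
        return chunks;
    }

    // Reassemble chunks that may come back from SimpleDB in any order.
    public static string Join(IEnumerable<string> chunks)
    {
        var ordered = chunks.OrderBy(c => int.Parse(c.Substring(0, PrefixLength)));
        var sb = new StringBuilder();
        foreach (var chunk in ordered)
        {
            sb.Append(chunk.Substring(PrefixLength));
        }
        return sb.ToString();
    }
}
```

Because every chunk carries its own index, Join can restore the original value no matter what order the attributes are returned in. With these constants, 256 * SpanSketch.ChunkLength gives the 261,376-character ceiling mentioned above.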

Using Compression

Note that a boolean value of false is passed to the SpanAttribute on our book mapping (above). This tells Simol to store the property without using compression. If we change that boolean value to true, Simol will GZip compress and Base64 encode Book.Content before storing it in SimpleDB. With compression enabled we can shrink the book of Acts from 140,000 characters down to 75,000 characters, or just 54% of its original size.

Now if we examine our SimpleDB item again it will contain just 75 attributes and look like this:

    Id                                    Content
    1a55e34b-0958-42df-bfad-8ff972c51e60  000H4sIAAAAAAAEAO29B2AcSZYlJi9tynt/SvVK1+B0oQiAYBMk2JBAEOzBiM3mkuwdaUcj...

Depending on the compression ratios achieved with your content, you may be able to store property values larger than 500 KB in a single SimpleDB item. But be careful when using compression with content that is highly randomized or already Base64 encoded, as compression may actually increase the size of your data in these cases. Property values smaller than 1,000 characters will not benefit much from compression either, due to the minimum compression overhead of about 160 characters plus the overhead of roughly one third that Base64 encoding adds to the compressed data.
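
To judge whether compression will help with your own content, you can measure it before enabling it. The sketch below GZip compresses and Base64 encodes a string using the standard .NET classes. It is an approximation for experimenting, not Simol's internal code: the CompressionSketch class is a hypothetical name, and the use of UTF-8 is an assumption.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Text;

// Hypothetical helper for illustration only; not part of Simol.
public static class CompressionSketch
{
    // GZip compress the text, then Base64 encode the compressed bytes.
    public static string Compress(string value)
    {
        byte[] raw = Encoding.UTF8.GetBytes(value); // UTF-8 is an assumption
        using (var output = new MemoryStream())
        {
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(raw, 0, raw.Length);
            }
            // The GZipStream must be closed before the bytes are read
            return Convert.ToBase64String(output.ToArray());
        }
    }

    // Reverse the process: Base64 decode, then GZip decompress.
    public static string Decompress(string encoded)
    {
        byte[] compressed = Convert.FromBase64String(encoded);
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var reader = new StreamReader(gzip, Encoding.UTF8))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Comparing CompressionSketch.Compress(text).Length with text.Length gives a rough idea of the savings. For very short strings the result is longer than the input, which is the overhead described above.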

Using Encryption

You can also use the SpanAttribute to encrypt properties stored in SimpleDB. Before you can store encrypted properties you must properly configure the IEncryptor instance used by Simol. IEncryptor is a simplified encryption interface defined by Simol. The default implementation is AesEncryptor, which performs 128-bit encryption using the .NET framework class AesCryptoServiceProvider.

AesEncryptor provides two static convenience methods for generating the key and initialization vector required by the AES algorithm. The key and IV must be stored and made available to your application for runtime encryption and decryption. The code below demonstrates the steps necessary to generate these values and properly configure the default encryptor.

    // generate a key and initialization vector
    var key = AesEncryptor.GenerateKey();
    var iv = AesEncryptor.GenerateIV();

    // configure the default encryptor
    var encryptor = (AesEncryptor)simol.Config.Encryptor;
    encryptor.Key = key;
    encryptor.IV = iv;

    // put an object with encrypted properties
    var book = new Book();
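
To see why the key and IV must be stored, here is a minimal AES round trip using the .NET cryptography classes directly. It is a sketch under stated assumptions and does not use Simol's AesEncryptor: the AesSketch class and its methods are hypothetical names, and UTF-8 text encoding is assumed.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper for illustration only; not part of Simol.
public static class AesSketch
{
    // Generate a random key and IV using the framework defaults.
    public static void NewKeyAndIV(out byte[] key, out byte[] iv)
    {
        using (var aes = Aes.Create())
        {
            aes.GenerateKey();
            aes.GenerateIV();
            key = aes.Key;
            iv = aes.IV;
        }
    }

    public static byte[] Encrypt(string plainText, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var encryptor = aes.CreateEncryptor(key, iv))
        {
            byte[] raw = Encoding.UTF8.GetBytes(plainText);
            return encryptor.TransformFinalBlock(raw, 0, raw.Length);
        }
    }

    // Decryption only succeeds with the same key and IV used to encrypt.
    public static string Decrypt(byte[] cipherText, byte[] key, byte[] iv)
    {
        using (var aes = Aes.Create())
        using (var decryptor = aes.CreateDecryptor(key, iv))
        {
            byte[] raw = decryptor.TransformFinalBlock(cipherText, 0, cipherText.Length);
            return Encoding.UTF8.GetString(raw);
        }
    }
}
```

If the key or IV is lost or regenerated between runs, previously stored values can no longer be decrypted, so both should be kept in secure configuration rather than generated at startup.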

Last edited May 27, 2011 at 3:49 AM by ashleytate, version 23


jgill Sep 8, 2011 at 5:10 AM 
Hello, I am having trouble getting the encryption to work. In the example the encryptor is set then a new Book object is saved without setting any properties. When I set the Content property of the Book object before saving it is not encrypted when saved/stored/put into the domain. For some reason it is always in clear text. Am I missing something in the example? I also checked the tests in the source and could not find one that succeeds in saving an encrypted entity then decrypting it.

I was hoping that the encryption would work on values like an int or decimal value then encrypt that value and store it as a string in SimpleDb essentially then when obtained from SimpleDb it would decrypt the value and automatically cast it back to an int or decimal instead of forcing all properties on the Book to be strings or byte[].

Any help would be appreciated. Also I believe the AESEncryptor that is used defaults to 256-bit encryption in .NET instead of the 128 mentioned in the documentation (another plus for Simol and .NET).

ashleytate May 26, 2011 at 4:05 AM 
@shykat_games: There is very little overhead when using spanning unless you use compression with small values. The normal overhead is just 3 characters per attribute value from storing the value index number (e.g. 000, 001, etc.).

shykat_games Mar 22, 2011 at 6:46 AM 
I for one think this flexibility is cool, though I would only use it as needed. Is there an overhead in using spanning in the case that sometimes your values may be over 1k, while some may only be a few hundred bytes? I could see using this in those cases.

ashleytate Feb 15, 2010 at 8:09 PM 
I don't really see this as supporting "large" content. Would I personally dump data that is consistently 500kb per item into SimpleDB? Probably not. But there are often usage scenarios where 90% of your data is under 1kb and you have a few edge cases that go all the way up to 10 or 100kb. Or maybe you need to store data that is consistently just a little over the SimpleDB limit. This gives you some flexibility in those cases.

SoopahMan Feb 15, 2010 at 10:59 AM 
Amazon's docs recommend you dump large content like this into S3. You could store the path to the S3 object in the Attribute Value.

It might be interesting to tie this functionality directly into SimpleSavant.