Working with AWS S3 through C#


A while ago I needed to use AWS S3 (Amazon's cloud-based file storage) to store some files and then download them or get their listings through C#. I noticed that there was no ready-made file repository implementation or walkthrough for doing this in .NET, so I decided to create a file repository in C# which lets .NET developers access S3 programmatically.

Here is a rundown as to how you would work with AWS S3 through C#:

In order to use any of the Amazon Web Services (AWS) APIs you will have to add the NuGet package provided by Amazon. Simply bring up the NuGet Package Manager window and search for the keyword AWS. The first item in the search results is most likely AWS SDK for .NET, which must be installed before you can access S3.
Once the SDK is installed we will have to find the properties of our S3 bucket and place them somewhere in the web.config (or app.config) file. Normally these three properties of the S3 bucket are required in order to access it securely:

  1. Secret key
  2. Access key
  3. Region end point

These details will be provided to you by your cloud administrator. Here is a list of region endpoints that you can place in your configuration file (e.g. us-west-1):
Region name / region code / endpoint / location constraint / protocol:

  • US Standard * (us-east-1) — endpoints: s3.amazonaws.com (Northern Virginia or Pacific Northwest) or s3-external-1.amazonaws.com (Northern Virginia only); location constraint: none required; HTTP and HTTPS
  • US West (Oregon), us-west-2 — endpoint: s3-us-west-2.amazonaws.com; location constraint: us-west-2; HTTP and HTTPS
  • US West (N. California), us-west-1 — endpoint: s3-us-west-1.amazonaws.com; location constraint: us-west-1; HTTP and HTTPS
  • EU (Ireland), eu-west-1 — endpoint: s3-eu-west-1.amazonaws.com; location constraint: EU or eu-west-1; HTTP and HTTPS
  • EU (Frankfurt), eu-central-1 — endpoint: s3.eu-central-1.amazonaws.com; location constraint: eu-central-1; HTTP and HTTPS
  • Asia Pacific (Singapore), ap-southeast-1 — endpoint: s3-ap-southeast-1.amazonaws.com; location constraint: ap-southeast-1; HTTP and HTTPS
  • Asia Pacific (Sydney), ap-southeast-2 — endpoint: s3-ap-southeast-2.amazonaws.com; location constraint: ap-southeast-2; HTTP and HTTPS
  • Asia Pacific (Tokyo), ap-northeast-1 — endpoint: s3-ap-northeast-1.amazonaws.com; location constraint: ap-northeast-1; HTTP and HTTPS
  • South America (Sao Paulo), sa-east-1 — endpoint: s3-sa-east-1.amazonaws.com; location constraint: sa-east-1; HTTP and HTTPS

 

In order to avoid adding the secret key, access key and region endpoint to the <appSettings> part of your configuration file, and to keep this tool more organised, I have created a configuration class for it. This configuration class lets you access the <configurationSection> element that is related to S3. To configure your app.config (or web.config) file you will have to add these <sectionGroup> and <section> elements to your configuration file:

 

<configSections>
  <sectionGroup name="AspGuy">
    <section name="S3Repository"
             type="Aref.S3.Lib.Strategies.S3FileRepositoryConfig, Aref.S3.Lib"
             allowLocation="true"
             allowDefinition="Everywhere" />
  </sectionGroup>
</configSections>

The S3FileRepositoryConfig class derives from the ConfigurationSection class and has properties that map to configuration elements of your .config file. A sample configuration for S3 looks like this:

 

<AspGuy>
  <S3Repository
    S3.ReadFrom.AccessKey="xxxxxxxxxx"
    S3.ReadFrom.SecretKey="yyyyyyyyyyyyyyyyy"
    S3.ReadFrom.Root.BucketName="-bucket-name-"
    S3.ReadFrom.RegionName="ap-southeast-2"
    S3.ReadFrom.RootDir="">
  </S3Repository>
</AspGuy>

Note that <AspGuy> comes from the name attribute of the <sectionGroup name="AspGuy"> element, and the <S3Repository> tag comes from the name of the <section> element. Each property of S3FileRepositoryConfig is mapped to an attribute of the <S3Repository> element.

Apart from SecretKey, AccessKey and BucketName, you can specify a root directory name as well. This setting lets you begin accessing the S3 bucket from a specific folder rather than from its root, and it is optional. For example, imagine a bucket with the following folder structure:

  • Dir1
  • Dir1/Dir1_1
  • Dir1/Dir1_2

If you set the RootDir property to "" then calling the GetSubDirNames method of the S3 file repository returns "Dir1", because Dir1 is the only top-level folder in the bucket. If you set the RootDir property to "Dir1" and then call GetSubDirNames, you will get two entries: "Dir1_1" and "Dir1_2".

Here is the code of the configuration class mentioned above:
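The listing for this class did not survive in this copy of the post; below is a sketch consistent with the sample configuration above (the authoritative version lives in the GitHub repository linked at the end of this section, and may differ in detail):

```csharp
using System.Configuration;

// Sketch of the configuration class; attribute names are taken from the
// sample <S3Repository> element shown earlier.
public class S3FileRepositoryConfig : ConfigurationSection
{
    [ConfigurationProperty("S3.ReadFrom.AccessKey", IsRequired = true)]
    public string AccessKey
    {
        get { return (string)this["S3.ReadFrom.AccessKey"]; }
    }

    [ConfigurationProperty("S3.ReadFrom.SecretKey", IsRequired = true)]
    public string SecretKey
    {
        get { return (string)this["S3.ReadFrom.SecretKey"]; }
    }

    [ConfigurationProperty("S3.ReadFrom.Root.BucketName", IsRequired = true)]
    public string BucketName
    {
        get { return (string)this["S3.ReadFrom.Root.BucketName"]; }
    }

    [ConfigurationProperty("S3.ReadFrom.RegionName", IsRequired = true)]
    public string RegionName
    {
        get { return (string)this["S3.ReadFrom.RegionName"]; }
    }

    [ConfigurationProperty("S3.ReadFrom.RootDir", IsRequired = false, DefaultValue = "")]
    public string RootDir
    {
        get { return (string)this["S3.ReadFrom.RootDir"]; }
    }

    // Reads the section from app.config/web.config using the
    // "sectionGroup/section" path defined in <configSections>.
    public static S3FileRepositoryConfig Load()
    {
        return (S3FileRepositoryConfig)ConfigurationManager.GetSection("AspGuy/S3Repository");
    }
}
```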

 

For the repository class I have created an interface, to remove the dependency of clients (e.g. a web service that may need to work with various file storages) on S3. This will let you add your own implementations for the file system, FTP and other file storage types and use them through dependency injection. Here is the code of this interface:
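The interface listing was lost from this copy; below is a reconstruction based on the method descriptions that follow (names outside that list, including the interface name IFileRepository itself, are my assumption):

```csharp
using System.Collections.Generic;

// Reconstruction of the repository abstraction; the exact signatures in
// the GitHub repository may differ.
public interface IFileRepository
{
    // Downloads a file hosted on S3 to the given local path.
    void Download(string fileName, string localPath);

    // Changes the current directory. A path starting with "/" is absolute
    // (starting from RootDir); otherwise it is relative to the current folder.
    void ChangeDir(string relativePath);

    // Retrieves the file names of the current folder.
    IEnumerable<string> GetFileNames();

    // Retrieves the names of the folders in the current folder.
    IEnumerable<string> GetSubDirNames();

    // Uploads a file to S3.
    void AddFile(string localPath, string fileName);

    // Checks whether a file already exists on S3.
    bool FileExists(string fileName);

    // Deletes the file from S3.
    void DeleteFile(string fileName);
}
```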

 

In this interface:

  • Download: Downloads a file hosted on S3 to disk.
  • ChangeDir: Changes the current directory/folder to the given directory. If the new directory (the relativePath parameter) starts with / then the path is treated as an absolute path (starting from the RootDir); otherwise it is relative to the current directory/folder.
  • GetFileNames: Retrieves the file names of the current folder
  • GetSubDirNames: Retrieves the name of folders in the current folder
  • AddFile: Uploads a file to S3
  • FileExists: Checks to see if a file is already on S3
  • DeleteFile: Deletes the file from S3

The implementation of these methods is quite simple using the AWS SDK for .NET. The only tricky part is that S3 does not support folders: in S3 everything is a key-value pair and the structure of entries is totally flat. What we do, however, is use the forward slash character to represent folders and then use that character as a delimiter to emulate a folder structure.
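To illustrate, here is a sketch of listing one emulated folder with the AWS SDK for .NET (the method and variable names are mine, and client construction is omitted): listing with a Prefix plus a Delimiter of "/" makes S3 group deeper keys into CommonPrefixes, which we treat as sub-folders.

```csharp
using System;
using Amazon.S3;
using Amazon.S3.Model;

public static class S3FolderListing
{
    public static void ListFolder(IAmazonS3 client, string bucketName, string folder)
    {
        var request = new ListObjectsRequest
        {
            BucketName = bucketName,
            Prefix = folder,   // e.g. "Dir1/"
            Delimiter = "/"    // stop grouping keys at the next slash
        };

        var response = client.ListObjects(request);

        // Keys that continue past the delimiter come back grouped here,
        // which is what we present as sub-folders.
        foreach (var subDir in response.CommonPrefixes)
            Console.WriteLine("Dir:  " + subDir);

        // Keys directly under the prefix are the "files" of this folder.
        foreach (var s3Object in response.S3Objects)
            Console.WriteLine("File: " + s3Object.Key);
    }
}
```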


You can clone the GitHub repository for this code and play with it, and feel free to send a pull request if you want to improve it. The repository is located at https://github.com/aussiearef/S3FileRepository.
Building high-performance ASP.NET applications


If you are building public-facing web sites, one of the things you want to achieve by the end of the project is good performance under load. That means you have to make sure your product works under heavy load (e.g. 50 concurrent users, or 200 users per second, etc.) even if you don't expect that much load at the moment. Chances are the web site will attract more and more users over time, and if it is not load tolerant it will start failing, leaving you with unhappy customers and a ruined reputation.

There are many articles on the Internet about improving the performance of ASP.NET web sites, which all make sense; however, I think there are some more things you can do to save yourself from facing massive dramas. So what steps can be taken to produce a high-performance ASP.NET or ASP.NET MVC application?

  • Load test your application from early stages

The majority of developers tend to leave load testing (if they ever do it) until the application is developed and has passed the integration and regression tests. Even though performing a load test at the end of the development process is better than not doing it at all, it might be way too late to fix performance issues once your code has already been written. A very common example: when the application does not respond properly under load, scaling out (adding more servers) is considered, and sometimes this is simply not possible because the code is not suitable for it — such as when the objects stored in Session are not serializable, so adding more web nodes or worker processes is impossible. If you find out at the early stages of development that your application may need to be deployed on more than one server, and you run your tests in an environment which is close to your final environment in terms of configuration, number of servers and so on, then your code will be adapted a lot more easily.

  • Use the high-performance libraries

Recently I was diagnosing the performance issues of a web site and came across a hot spot in the code where JSON messages coming from a third-party web service had to be de-serialized several times. Those JSON messages were de-serialized with Newtonsoft.Json, and it turned out that Newtonsoft.Json was not the fastest library when it came to de-serialization. We replaced Json.NET with a faster library (e.g. ServiceStack) and got a much better result.

Again, if the load test had been done at an early stage, when we picked Json.NET as our serialization library, we would have found that performance issue a lot sooner and would not have had to make so many changes in the code and re-test it entirely.

  • Is your application CPU-intensive or IO-intensive?

Before you start implementing your web site, while the project is being designed, one thing you should think about is whether your site is CPU-intensive or IO-intensive. This is important for choosing the strategy you will use to scale your product.

For example, if your application is CPU-intensive you may want to use a synchronous pattern with parallel processing and so forth, whereas for a product with many IO-bound operations, such as communicating with external web services or network resources (e.g. a database), the task-based asynchronous pattern might be more helpful for scaling out your product. Plus, you may want to have a centralized caching system in place which will let you create Web Gardens and Web Farms in the future, thus spreading the load across multiple worker processes or servers.

  • Use Task-based Asynchronous Model, but with care!

If your product relies on many IO-bound operations, or includes long-running operations which may make the expensive IIS threads wait for an operation to complete, you had better think about using the Task-based Asynchronous Pattern for your ASP.NET MVC project.

There are many tutorials on the Internet about asynchronous ASP.NET MVC actions, so in this blog post I will refrain from explaining them. However, I have to point out that traditional synchronous actions in an ASP.NET (MVC) site keep IIS threads busy until the operation is done and the request is processed. This means that if the site is waiting for an external resource (e.g. a web service) to respond, the thread stays busy. The number of threads in .NET's thread pool that can be used to process requests is limited, therefore it's important to release threads as soon as possible. A task-based asynchronous action or method releases the thread while the request is being processed, then grabs a new thread from the thread pool to return the result of the action. This way, many requests can be processed by a few threads, which leads to better responsiveness for your application.

Although the task-based asynchronous pattern can be very handy for the right applications, it must be used with care. There are a few concerns to keep in mind when you design or implement a project based on the Task-based Asynchronous Pattern (TAP). The biggest challenge developers face when using the async and await keywords is knowing that in this context they have to deal with threads slightly differently. For example, you can create a method that returns a Task (e.g. Task<Product>). Normally you might call .Wait() on that task, or read task.Result, to force the task to run and fetch the result. In a method or action built on TAP, either of those calls will block your running thread, making your program sluggish or even causing deadlocks.
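A minimal sketch of that trap (the service class and URL are hypothetical):

```csharp
using System.Net.Http;
using System.Threading.Tasks;

public class QuoteService
{
    private static readonly HttpClient Client = new HttpClient();

    public async Task<string> GetQuoteAsync()
    {
        // While the request is in flight, the calling thread is released.
        return await Client.GetStringAsync("http://example.com/quote");
    }

    public string Dangerous()
    {
        // Blocks the calling thread until the task completes. Inside ASP.NET,
        // which has a synchronization context, this can deadlock: the
        // continuation of GetQuoteAsync needs the very thread we are blocking.
        return GetQuoteAsync().Result;
    }

    public async Task<string> Safe()
    {
        // Awaiting keeps the thread free and lets the continuation resume later.
        return await GetQuoteAsync();
    }
}
```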

  • Distribute caching and session state

    It’s very common that developers build a web application on a single development machine and assume that the product will be running on a single server too, whereas it’s not usually the case for big public facing web sites. They often get deployed to more than one server which are behind a load balancer. Even though you can still deploy a web site with In-Proc caching on multiple servers using sticky session (where the load balancer directs all requests that belong to the same session to a single server), you may have to keep multiple copies of session data and cached data. For example if you deploy your product on a web farm made of four servers and you keep the session data in-proc, when a request comes through the chance of hitting a server that already contains a cached data is 1 in 4 or 25%, whereas if you use a centralized caching mechanism in place, the chance of finding a cached item for every request if 100%. This is crucial for web sites that heavily rely on cached data.

    Another advantage of having a centralized caching mechanism (using something like AppFabric or Redis) is the ability to implement a proactive caching system around the actual product. A proactive caching mechanism may be used to pre-load the most popular items into the cache before they are even requested by a client. This can massively improve the performance of a big data-driven application, if you manage to keep the cache synchronized with the actual data source.

  • Create Web Gardens

As mentioned before, in an IO-bound web application that involves quite a few long-running operations (e.g. web service calls) you may want to free up your main thread as much as possible. By default every web site runs under one main thread which is responsible for keeping the site alive, and unfortunately when it's too busy, your site becomes unresponsive. One way of adding more "main threads" to your application is to add more worker processes to your site under IIS. Each worker process includes a separate main thread, so if one is busy there will be another one to process incoming requests.

Having more than one worker process turns your site into a Web Garden, which requires your Session and Application data to be persisted out-of-proc (e.g. on a state server or in SQL Server).

  • Use caching and lazy loading in a smart way

    There is no need to emphasize that if you cache a commonly accessed bit of data in memory, you can reduce database and web service calls. This specifically helps IO-bound applications which, as I said before, may cause a lot of grief when the site is under load.

    Another approach for improving the responsiveness of your site is lazy loading. Lazy loading means that the application does not hold a certain piece of data yet, but knows where that data is. For example, if there is a drop-down control on your web page which is meant to display a list of products, you don't have to load all products from the database when the page loads. You can add a jQuery function to the page which populates the drop-down list the first time it is pulled down. You can apply the same technique in many places in your code, such as when you work with LINQ queries and CLR collections.
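    On the server side, the same idea can be expressed with Lazy<T>; a sketch (the class and its contents are hypothetical):

```csharp
using System;
using System.Collections.Generic;

public class ProductCatalogue
{
    // Lazy<T> defers the expensive load until the first time .Value is read —
    // the server-side analogue of populating the drop-down on demand.
    private readonly Lazy<List<string>> _products =
        new Lazy<List<string>>(LoadProductsFromDatabase);

    public IEnumerable<string> Products
    {
        get { return _products.Value; }
    }

    private static List<string> LoadProductsFromDatabase()
    {
        // Placeholder for the real database call.
        return new List<string> { "Keyboard", "Mouse", "Monitor" };
    }
}
```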

  • Do not put C# code in your MVC views

    Your ASP.NET MVC views are compiled at run time, not at compile time. Therefore, if you include too much C# code in them, that code will not be compiled into DLL files ahead of time. Not only does this damage the testability of your software, it also makes your site slower, because every view takes longer to display (it must be compiled first). Another downside of adding code to views is that they cannot run asynchronously, so if you decide to build your site on the Task-based Asynchronous Pattern (TAP), you won't be able to take advantage of asynchronous methods and actions in the views.

    For example, if there is a method like this in your code:

    public async Task<string> GetName(int code)
    {
        var result = …
        return await result;
    }

This method can be run asynchronously in the context of an asynchronous ASP.NET MVC action like this:

    public async Task<ActionResult> Index(CancellationToken ctx)
    {
        var name = await GetName(100);
        return View();
    }

But if you call this method in a view, because the view is not asynchronous you will have to run it in a thread-blocking way:

    var name = GetName(100).Result;

.Result blocks the running thread until GetName() completes, so the execution of the app halts for a while, whereas when this code is awaited with the await keyword the thread is not blocked.

  • Use Fire & Forget when applicable

If two or more operations do not form a single transaction, you probably do not have to run them sequentially. For example, if users can sign up and create an account on your web site, and once they register you save their details in the database and then send them an email, you don't have to wait for the email to be sent to finalize the operation.

In such a case, the best approach is probably to start a new thread to send the email and return to the main thread straight away. This is called a fire-and-forget mechanism, and it can improve the responsiveness of an application.
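A sketch of the idea with Task.Run (the class and method names are hypothetical; note the try/catch inside the task, since nothing awaits it and an unobserved exception would otherwise be lost):

```csharp
using System;
using System.Threading.Tasks;

public class RegistrationService
{
    public void Register(string email)
    {
        SaveUserToDatabase(email);

        // Fire and forget: the email is sent on a thread-pool thread while
        // the caller returns immediately.
        Task.Run(() =>
        {
            try
            {
                SendWelcomeEmail(email);
            }
            catch (Exception ex)
            {
                Console.Error.WriteLine("Email failed: " + ex.Message);
            }
        });
    }

    private void SaveUserToDatabase(string email) { /* ... */ }
    private void SendWelcomeEmail(string email) { /* ... */ }
}
```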

  • Build for x64 CPU

32-bit applications are limited to a smaller amount of memory and have access to fewer calculation features/instructions of the CPU. To overcome these limitations, if your server is a 64-bit one, make sure your site is running in 64-bit mode (by making sure the option for running the application pool in 32-bit mode in IIS is not enabled). Then compile and build your code for x64 rather than Any CPU.

One example of x64 being helpful: to improve the responsiveness and performance of a data-driven application, a good caching mechanism is a must. In-proc caching is a memory-consuming option because everything is stored within the memory boundaries of the site's application pool. An x86 process can address at most 4 GB of memory, so if loads of data are added to the cache this limit will soon be hit. If the same site is built explicitly for x64, this limit is removed, so more items can be added to the cache, meaning less communication with the database, which leads to better performance.

  • Use monitoring and diagnostic tools on the server

    There may be many performance issues that you never see with the naked eye because they never appear in error logs. Identifying performance issues is even more daunting when the application is already on production servers, where you have almost no chance of debugging.

    To find slow processes, thread blocks, hangs, errors and so forth, it's highly recommended to install a monitoring and/or diagnostic tool on the server and have it track and monitor your application constantly. I personally have used New Relic (which is a SaaS product) to check the health of our online sites.

  • Profile your running application

    Once you finish developing your site, deploy it to IIS, attach a profiler (e.g. the Visual Studio Profiler) and take snapshots of various parts of the application: for example the purchase operation or the user sign-up operation. Then check whether there is any slow or blocking code in them. Finding those hot spots at an early stage might save you a great amount of time, reputation and money.

Web API and returning a Razor view


There are scenarios in which an API in a Web API application needs to return formatted HTML rather than a JSON message. For example, we worked on a project where most APIs performed a search and returned the result as JSON or XML, while a few of them had to return HTML to be used by an Android app (in a web view container).

One solution would be breaking the controller into two: one inherited from the MVC Controller class and the other derived from ApiController. However, since those APIs are in the same category in terms of functionality, I would keep them in the same controller.

Moreover, using ApiController and returning HttpResponseMessage lets us modify the details of the implementation in the future without having to change the return type (e.g. from ActionResult to HttpResponseMessage), and also makes it easier to upgrade to Web API 2 later.

The advent of IHttpActionResult in Web API 2 allows developers to return custom data. In case you are not using ASP.NET MVC 5 yet, or you are after an easier way, keep reading!

To parse and return a Razor view in a Web API project, simply add some views to your application just like you would for a normal ASP.NET MVC project. Then, through NuGet, find and add RazorEngine, which is a handy library for reading and parsing Razor views.

Inside the API, simply create an object to act as a model, load the content of the view as text, pass the view's body and the model to RazorEngine, and get a parsed version of the view. Since the API is meant to return HTML, the content type must be set to text/html.
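The screenshot that showed this code has not survived in this copy; here is a sketch of the approach, assuming the classic RazorEngine Razor.Parse API (newer RazorEngine versions use Engine.Razor.RunCompile instead) and a hypothetical controller and view path:

```csharp
using System.IO;
using System.Net;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Web;
using System.Web.Http;
using RazorEngine;   // the NuGet package mentioned above

public class ReportController : ApiController
{
    public HttpResponseMessage Get(int id)
    {
        // A plain object acts as the model for the view.
        var model = new { Name = "Product " + id, Price = 9.99m };

        // Load the Razor view as text and let RazorEngine render it.
        var viewPath = HttpContext.Current.Server.MapPath("~/Views/Report.cshtml");
        var template = File.ReadAllText(viewPath);
        var html = Razor.Parse(template, model);

        // The API returns HTML, so the content type must be text/html.
        var response = new HttpResponseMessage(HttpStatusCode.OK)
        {
            Content = new StringContent(html)
        };
        response.Content.Headers.ContentType = new MediaTypeHeaderValue("text/html");
        return response;
    }
}
```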

In this example, the view itself is an ordinary Razor template whose model is bound to type "dynamic", which lets the view accept a wide range of types. You can move the code from your API to a helper class (or something similar) and create a function which accepts a view name and a model, and returns the rendered HTML.

The magic behind the Google Search


Have you ever wondered how Google performs a fast search over a wide variety of file types? For example, how is Google able to suggest a list of search expressions while you are still typing in your keywords?

Another example is Google image search: you upload an image and Google finds similar photos for you in no time.

The key to this magic is SimHash, a mechanism/algorithm invented by Charikar (see the full patent). The name comes from the combination of Similarity and Hash. Instead of comparing objects with each other to find their similarity, we convert each object to an N-bit number that represents it (known as a hash) and compare those. In other words, if we maintain a number that represents each object, we can compare those numbers to find the similarity of two objects.

The basics of SimHash are as below:

  1. Convert the object to a hash value. (From my experience, this is better to be an unsigned integer.)
  2. Count the number of matching bits. For example, are the first bits of the two hash values the same? Are the 2nd bits the same?
  3. Depending on the size of the hash value (number of bits), you will get a number between 0 and N, where N is the length of the hash value. This is called the Hamming distance. Hamming distance was introduced by Richard Hamming in 1950 (see here).
  4. The number you get must be normalized and finally represented as a value such as a percentage. To do so, we can use a simple formula such as Similarity = (HashSize - HammingDistance) / HashSize.

Since the hash value can be used to represent any kind of data, such as a text or an image file, it can be used to perform a fast search on almost any file type.

To calculate the hash value, we have to decide on the hash size, which is normally 32 or 64 bits. As I said before, an unsigned value works better. We also need to choose a chunk size. The chunk size is used to break the data down into small pieces, called shingles. For example, if we want to convert a string such as "Hello World" to a hash value and the chunk size is 3, our chunks would be:

  1. Hel
  2. ell
  3. llo

and so on.
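A chunker of this shape might be written as follows (a sketch; the method name is mine):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Shingling
{
    // Produces overlapping shingles: each chunk starts one character after
    // the previous one, so "Hello" with size 3 yields "Hel", "ell", "llo".
    public static IEnumerable<string> Shingles(string input, int chunkSize)
    {
        for (var i = 0; i + chunkSize <= input.Length; i++)
        {
            yield return input.Substring(i, chunkSize);
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Shingles("Hello", 3)));
    }
}
```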

To convert binary data to a hash value, you have to break it down into chunks of bits, i.e. pick every K bits. Google says that N=64 and K=3 are recommended.

To calculate the hash value in a SimHash manner, we have to take the following steps:

  1. Tokenize the data: break it down into small chunks as mentioned above and store the chunks in an array.
  2. Create an array (called a vector) of size N, where N is the size of the hash (let's call this array V).
  3. Loop over the array of tokens (let i be the index of each token).
  4. Loop over the bits of each token's hash (let j be the index of each bit).
  5. If bit j of token i is 1, add 1 to V[j]; otherwise subtract 1 from V[j].
  6. Assume that the fingerprint is an unsigned value (32 or 64 bits) named F.
  7. Once the loops finish, go through the array V and, if V[i] is greater than 0, set bit i of F to 1, otherwise to 0.
  8. Return F as the fingerprint.

Here is the code:

private int DoCalculateSimHash(string input)
{
    ITokeniser tokeniser = new Tokeniser();
    var hashedtokens = DoHashTokens(tokeniser.Tokenise(input));
    var vector = new int[HashSize];
    for (var i = 0; i < HashSize; i++)
    {
        vector[i] = 0;
    }

    foreach (var value in hashedtokens)
    {
        for (var j = 0; j < HashSize; j++)
        {
            if (IsBitSet(value, j))
            {
                vector[j] += 1;
            }
            else
            {
                vector[j] -= 1;
            }
        }
    }

    var fingerprint = 0;
    for (var i = 0; i < HashSize; i++)
    {
        if (vector[i] > 0)
        {
            fingerprint += 1 << i;
        }
    }

    return fingerprint;
}

And the code to calculate the hamming distance is as below:

private static int GetHammingDistance(int firstValue, int secondValue)
{
    var hammingBits = firstValue ^ secondValue;
    var hammingValue = 0;
    for (int i = 0; i < 32; i++)
    {
        if (IsBitSet(hammingBits, i))
        {
            hammingValue += 1;
        }
    }
    return hammingValue;
}
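Putting the pieces together with the normalization formula from step 4 (a sketch; GetHammingDistance is the method above):

```csharp
// Turns a Hamming distance into a similarity ratio, using
// Similarity = (HashSize - HammingDistance) / HashSize.
private static double GetSimilarity(int first, int second)
{
    const double hashSize = 32.0;               // the fingerprints are 32-bit
    var distance = GetHammingDistance(first, second);
    return (hashSize - distance) / hashSize;    // e.g. distance 4 -> 0.875
}
```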

You may tokenize the data in different ways. For example, you may break a string down into words, n-letter words, or n-letter overlapping pieces. From my experience, if N is the size of each chunk and M is the number of overlapping characters, N=4 and M=3 are the best choices.

You may download the full source code of SimHash from SimHash.CodePlex.com. Bear in mind that SimHash is patented by Google!

Single Sign On with WCF and Asp.NET Custom Membership Provider


I was recently involved in designing enterprise software comprising several ASP.NET web sites and desktop applications. Since the same users would use this software system, a single sign-on feature was necessary, so I took advantage of Windows Identity Foundation, which is based on claims-based authentication. Also, as the applications would be used by internal users, I used Active Directory Federation to authenticate users against an existing Active Directory.

Later I thought: what would the solution be if claims-based authentication did not fit? So I decided to design and implement a simple single sign-on (SSO) authentication service with WCF. This SSO has the following specifications:

  • Only performs authentication. It does not contain any role management capability, because roles and access rights are defined in the scope of each application (or sub-system), so they are left to the applications.
  • It is based on Web Services, so it can be consumed by any technology that understands them (e.g. Java apps).
  • It can support many kinds of user storage, such as Active Directory, ASP.NET Application Services (ASPNETDB), custom authentication, etc.
  • It can be used by Web, Desktop and Mobile applications.

Figure 1, Components of SSO

As seen in figure 1, user information can be retrieved from Active Directory, ASP.NET Application Services, a custom database or 3rd-party web services. The components that encapsulate the details of each user storage or service are called federations. This term is used by Microsoft in Windows Identity Foundation, so we keep using it!

The single sign-on service relies on federation services. Each federation service is simply a .NET class library containing a class which implements an interface shared between itself and the SSO service. A federation module is plugged into the SSO service via a configuration file. The Visual Studio solution downloadable at the bottom of this post includes a custom federation module that uses SQL Server to store user information.

Client applications can consume the web service directly to perform sign-in, sign-out, authentication and other related operations. However, this solution also includes a custom ASP.NET membership provider which allows ASP.NET applications to consume the SSO with no hassle. It also enables existing ASP.NET applications to use this SSO service with a small configuration change.

Figure 2, Package diagram of SSO

Classes and Interfaces

The key classes and interfaces are as below:

  • ISSOFederation interface: Implemented by the federation classes.
  • AuthenticatedUser: Used by the federation classes and the SSO service to represent a user. This type is also emitted by the service to the clients.

Figure 3, Key types of Common package

  • CustomFederation: Implements ISSOFederation and encapsulates the details of authentication and user storage.
  • SSOFederationHelper (SignInServices package): Provides a service to load the nominated federation module. Only one instance of the federation object exists (a singleton), so to plug in another federation module the service must be restarted (e.g. restart the IIS web site).

Figure 4, SSOFederationHelper class

A federation object is plugged into the service using reflection. To do so, the fully qualified name of the federation type is first placed in the Web.config file:

<appSettings>
  <add key="federationType"
       value="SignInServices.CustomFederation.SSOCustomFederation, SignInServices.CustomFederation" />
</appSettings>
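The loading inside SSOFederationHelper might look like this (a sketch; the real implementation is in the CodePlex download, and ISSOFederation is the interface described earlier):

```csharp
using System;
using System.Configuration;

public static class SSOFederationHelper
{
    // Lazy<T> gives us the single (singleton) federation instance; swapping
    // the federation module requires a service restart, as noted above.
    private static readonly Lazy<ISSOFederation> Instance =
        new Lazy<ISSOFederation>(Create);

    public static ISSOFederation FederationObject
    {
        get { return Instance.Value; }
    }

    private static ISSOFederation Create()
    {
        // The assembly-qualified type name comes from <appSettings>.
        var typeName = ConfigurationManager.AppSettings["federationType"];
        var federationType = Type.GetType(typeName, throwOnError: true);
        return (ISSOFederation)Activator.CreateInstance(federationType);
    }
}
```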

 

 

This type is instantiated via reflection and exposed to consumers through the SSOFederationHelper.FederationObject property. For example:

 

public List<AuthenticatedUser> FindUsersByEmail(string EmailAddress)
{
    return SSOFederationHelper.FederationObject
        .FindUsersByEmail(EmailAddress).ToList();
}

 

  • SSOMembershipProvider (SSOClientServices package) is a custom ASP.NET membership provider. This class enables ASP.NET applications to take advantage of the SSO service without depending on it directly. The key point is that all ASP.NET applications that take advantage of the SSO service must give the same name to the ASP.NET membership cookie. Example:

<authentication mode="Forms">
  <forms loginUrl="~/Account/Login.aspx" timeout="2880" name=".SSOV1Auth" />
</authentication>


 

The custom membership provider has its own app.config file, which includes the WCF client configuration (e.g. binding configuration). However, the URI of the service is configured in the Web.config file of the ASP.NET client. For example, an ASP.NET client configures the membership provider as below:

<membership defaultProvider="SSOMembershipProvider">
  <providers>
    <clear/>
    <add name="SSOMembershipProvider"
         type="SSOClientServices.SSOMembershipProvider, SSOClientServices"
         endPointUri="http://localhost/SSO/service/sso.svc"
         enablePasswordRetrieval="false"
         enablePasswordReset="true"
         requiresQuestionAndAnswer="false"
         requiresUniqueEmail="false" />
  </providers>
</membership>

 

The value of the endPointUri entry is used to configure the service proxy.

 

The Visual Studio solution

The source code provided here was developed with Visual Studio 2010 and requires the following components to be installed:

  1. Visual Studio 2010 Express Edition
  2. Entity Framework 4.0
  3. WCF
  4. C# 4.0
  5. IIS 7
  6. SQL Server 2008 or 2008 Express

The following Visual Studio (C#) projects are included:

  1. SignInServices.Common: Includes common types
  2. SignInServices.CustomFederation: Is a sample Federation provider
  3. SignInServices: The Single Sign On WCF Service
  4. SSOClientServices: Includes a custom ASP.NET membership provider
  5. TestWebSite: An ASP.NET web site to test the SSO

How to download the source code?

The source code has been uploaded to CodePlex. Please go to http://wcfsso.codeplex.com and download the latest change set.

 

How to deploy?

In order to deploy the solution take the following actions:

  1. Restore the SQL Server 2008 database on a SQL Server 2008 server under the name “Framework”.
  2. Create a Windows login in SQL Server 2008 and grant access to the restored database (e.g. Domain\SSOUser).
  3. Launch IIS
  4. Create an application pool that targets .NET 4 and uses “Integrated” pipeline mode. Name this application pool SSO.
  5. Set the identity account of the newly created application pool. This must be the Windows account that you added to SQL Server (e.g. Domain\SSOUser). This is required because the existing custom federation project uses Windows authentication to access the database.
  6. Open the solution file in Visual Studio 2010
  7. Publish SignInServices application to a folder
  8. Go to the folder
  9. Open web.config file and configure the following entries:
    1. Under <appSettings>, set SessionTimeOutMinutes. This value indicates how long a sign-in ticket is valid.
    2. Under <appSettings> set federationType to any other federation type that you may want to use. If you want to use the custom federation type shipped with this sample, leave the current value as is.
    3. Under <connectionStrings>, update the connection strings if you restored the database under a name other than “Framework”, or if you want to use SQL Server authentication rather than Windows authentication.
  10. Build SignInService.CustomFederation project and copy the .DLL file to the \BIN folder of SignInService
  11. Build SSOClientServices project and copy the .DLL file to the \BIN folder of SignInService
  12. Go back to IIS
  13. Under IIS create a new Web Application that uses SSO as its Application Pool and points to the folder to which you published the SignInService application.
  14. Publish TestWebSite to IIS or simply run it in VS 2010.
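For reference, the entries from step 9 might look roughly like this in web.config (the timeout and connection string values below are placeholders, not the shipped defaults):

```xml
<appSettings>
  <!-- How long (in minutes) a sign-in ticket stays valid -->
  <add key="SessionTimeOutMinutes" value="30" />
  <!-- federationType: leave the shipped value as is to use the sample custom federation -->
</appSettings>
<connectionStrings>
  <!-- Adjust the catalog name if the database was restored under a different
       name, or switch to SQL Server authentication here -->
  <add name="Framework"
       connectionString="Data Source=.;Initial Catalog=Framework;Integrated Security=True"
       providerName="System.Data.SqlClient" />
</connectionStrings>
```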

 

Configuring the custom federation

The custom federation uses SMTP to send a new password to users when a password reset is requested. The SMTP server configuration is in the Web.config file of the WCF application (SignInService). You must configure SMTP in order for password resets to work.

<system.net>
  <mailSettings>
    <smtp deliveryMethod="Network" from="name@domain.com">
      <network host="smtp.mail.com"
               userName="name@domain.com"
               password="password of sender"
               port="25" />
    </smtp>
  </mailSettings>
</system.net>
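With this section in place, a parameterless System.Net.Mail.SmtpClient picks up the host, port, and credentials automatically. A minimal sketch (the recipient address, subject, and body below are placeholders):

```csharp
using System.Net.Mail;

// A parameterless SmtpClient reads its settings from
// <system.net>/<mailSettings> in the config file.
using (var client = new SmtpClient())
{
    var message = new MailMessage("name@domain.com", "user@example.com")
    {
        Subject = "Your new password",
        Body = "Your password has been reset."
    };
    client.Send(message);
}
```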

 

Important notice: The solution uses a custom database to store the user information. To add new users, install SignInService under IIS and then navigate to http://service-url/Admin/ManageUsers.aspx. For example, if the SSO WCF service is deployed to http://127.0.0.1/SSO/SSO.SVC, you may access the user management page via http://127.0.0.1/SSO/Admin/ManageUsers.aspx

(The admin page still needs to be completed, as I focused on developing the SSO service rather than implementing the admin web site.)

Important notice: For simplicity, neither the applications nor the WCF service is protected. If you deploy this solution to a production environment, you must protect them using ASP.NET authentication or WCF security practices.
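As a sketch of one standard option, an ASP.NET authorization rule in web.config can at least lock down the admin pages (this assumes the admin pages live under an Admin folder, as in this sample):

```xml
<location path="Admin">
  <system.web>
    <authorization>
      <!-- Deny anonymous users; require an authenticated session -->
      <deny users="?" />
    </authorization>
  </system.web>
</location>
```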

Important notice: Once you launch the TestWebSite, you’ll see a login screen. Enter the following default credentials:

User: aref.karimi

Password: 123

 

 

What is a Software Architecture Document and how would you build it?


Howdy!

After working as a senior designer and/or software architect on three sub-continents, I came across a kind of phenomenon in Australia! I call it a phenomenon because, first of all, terms such as ‘solution architect’, ‘software architect’ and/or ‘enterprise architect’ are used interchangeably and sometimes incorrectly. Second, architecture is often ignored, and contractors or consultants usually start doing the detailed design as soon as they receive a requirements document.

This leaves the client (the owner of the project) with a whole bunch of documents that are not understandable to them, so they have to hand the documents over to a development team without even knowing whether the design is what they really wanted.

This happens because such a crucial role is assigned to a senior developer or a designer who thinks purely technically, whilst an architect must be able to look at the problem from different aspects. (This happens because in Australia titles are given away for free; just ask for one!)

What is Architecture?

Architecture is the fundamental organization of a system embodied in its components, their relationships to each other, and to the environment, and the principles guiding its design and evolution (IEEE 1471).

The definition suggested by IEEE (above) refers to a solution architect and/or software architect. However, as Microsoft suggests there are other kinds of architects such as a Business Strategy Architect.

There are basically six types of Architects:

  • Business Strategy Architect

The purpose of this role is to change the business focus and define the enterprise’s to-be status. This role is about the long view and about forecasting.

  • Business Architect

The mission of business architects is to improve the functionality of the business. Their job isn’t to architect software but to architect the business itself and the way it is run.

  • Solution Architect

Solution architect is a relatively new term, and it should refer also to an equally new concept. Sometimes, however, it doesn’t; it tends to be used as a synonym for application architect.

  • Software Architect

Software architecture is about architecting software meant to support, automate, or even totally change the business and the business architecture.

  • Infrastructure Architect

The technical infrastructure exists for deployment of the solutions of the solution architect, which means that the solution architect and the technical infrastructure architect should work together to ensure safe and productive deployment and operation of the system.

  • Enterprise Architect

Enterprise Architecture is the practice of applying a comprehensive and rigorous method for describing a current and/or future structure and behaviour for an organization’s processes, information systems, personnel and organizational subunits, so that they align with the organization’s core goals and strategic direction. Although often associated strictly with information technology, it relates more broadly to the practice of business optimization in that it addresses business architecture, performance management, and process architecture as well (Wikipedia).

Solution Architect

As we are techies let’s focus on Solution Architect role:

It tends to be used as a synonym for application architect. In an application-centric world, each application is created to solve a specific business problem, or a specific set of business problems. All parts of the application are tightly knit together, each integral part being owned by the application. An application architect designs the structure of the entire application, a job that’s rather technical in nature. Typically, the application architect doesn’t create, or even help create, the application requirements; other people, often called business analysts, less technically and more business-oriented than the typical application architect, do that.

So if you are asked to come on board and architect a system based on a whole bunch of requirements, you are very likely being asked to do solution architecture.

How to do that?

A while back, a person who does not have a technical background (but he has money, so he is the boss) was lecturing that in an ideal world no team member has to talk to other team members. At the time I was thinking that in my ideal world, which is very close to the Agile world, everybody can (or should) speak to everybody else. This points out that how you architect a system is strongly tied to your methodology. Which methodology you follow does not make a big difference, as long as you stick to the correct concepts. Likewise, he was saying that the Software Architecture Document is part of the BRD (Business Requirements Document), as if, being technical, it would not be understandable to business people (e.g. the stakeholders). And I was thinking to myself: mate! There are different views analyzed in a SAD. Some of them are technical, some of them are not.

What the above story points out to me is that solution architecture is the art of mapping business concerns to technical ones, or in other words, speaking about technical things in a language that is understandable to business people.

A very good way to do this is to put yourself in the stakeholders’ shoes. There are several types of stakeholders in each project, each with their own views and their own concerns. This is the biggest difference between design and architecture: a designer thinks very technically, while an architect can think broadly and look at a problem from different views. Designers often make a huge mistake, which happens a lot in Australia: they put everything in one document. Where I am doing a solution architecture job now, I was given a 21-megabyte MS Word document which included everything, from requirements to detailed class and database design. Such a document is very unlikely to be understandable by the stakeholders and very hard for developers to use. I reckon this happens because, firstly, designers don’t consider the separation of stakeholders’ and developers’ concerns, and secondly, because it’s easier to write everything down in one document. But this is wrong, as the SAD and the design document (e.g. TSD) are built for different purposes and for different audiences (and in different phases, if you are following a phase-based methodology such as RUP). Putting everything in one document is like cooking dinner by putting the ingredients along with the utensils in a pot and boiling them!

A very good approach for looking at the problem from the stakeholders’ point of view is the 4+1 model. In this model, scenarios (or use cases) are the base, and we look at them from a Logical view (what the building blocks of the system are), a Process view (processes such as asynchronous operations), a Development (aka Implementation) view and a Physical (aka Deployment) view. There are also optional views, such as a Data view, that you can use if you need to. Some of the views are technical and some of them are not; however, they must match, and there must be consistency in the architecture so that technical views cover business views (e.g. demonstration of a business process with a UML activity diagram and/or state diagram).

I believe that each software project is like a spectrum of which each stakeholder sees a limited part. The role of an architect is to see the entire spectrum. A good approach to do so (one that I use a lot) is to include a business vision (this might not be the best term) in your SAD. It can be a bulleted list, a diagram or both, showing what the application looks like from a business perspective. Label each part of the business vision with a letter or a number. Then add an architectural overview and map it to the items of the business vision, indicating which part of the architecture is meant to address which part of the business vision.

In a nutshell, architecture is the early design decisions; it is not the design.

What to put in an SAD?

There are a whole bunch of SAD templates on the internet, such as the template offered by RUP. However, the following items seem to be necessary for every architecture document:

  • Introduction. This can include Purpose, Glossary, Background of the project, Assumptions, References etc. I personally suggest that you explain what kind of methodology you are following. This will avoid lots of debates, I promise!

It is very important to make the scope of the document clear. Without a clear scope, not only will you never know when you are finished, you also won’t be able to convince the stakeholders that the architecture is comprehensive enough and addresses all their needs.

  • Architectural goals and constraints: This can include the goals, as well as your business and architectural visions. Also explain the constraints (e.g. if the business has decided to develop the software system with Microsoft .NET, that is a constraint). I would suggest that you mention the components (or modules) of the system when you mention your architectural vision. For example, say that it will include Identity Management, Reporting etc., and explain your strategy for addressing them. As this section is intended to help the business people understand your architecture, try to include clear and well-organised diagrams.

A very important item to mention is the architectural principles that you are following. This is even more important when the client organization maintains a set of architectural principles.

    • Quality of service requirements: Quality of service requirements address the quality attributes of the system, such as performance, scalability, security etc. These items must not be described in technical language and must not contain implementation details (e.g. the use of Microsoft Enterprise Library 5).
    • Use Case View: Views basically come from the 4+1 model, so if you follow a different model you might not have this one. However, it is very important that you identify key scenarios (or use cases) and describe them at a high level. Again, diagrams, such as a use case diagram, help.
    • Logical View: The logical view demonstrates the logical decomposition of the system, such as the packages that build it. It will help the business people and the designers understand the system better.
    • Process View: Use activity diagrams as well as state diagrams (if necessary) to explain the key processes of the system (e.g. the process of approving a leave request).
    • Deployment View: The deployment view demonstrates how the system will work in a real production environment. I suggest that you include two types of diagrams: one (normal) human-understandable diagram, such as a Visio diagram that shows the network, firewall, application server, database, etc., and a UML deployment diagram that demonstrates the nodes and dependencies. This again helps the business and technical people have the same understanding of the physical structure of the system.
    • Implementation View: This part is the most interesting section for the techies. I like to include the implementation options (e.g. Java and .NET) and provide a list of pros and cons for each of them. Again, technical pros and cons don’t make much sense to business people; they are mostly interested in cost of ownership, availability of resources and so on. If you suggest a technology, or if one has already been selected, list the products and services that are needed in a production environment (e.g. IIS 7, SQL Server 2008). It is also good to include a very high-level diagram of the system.

I also like to explain the architectural patterns that I’m going to use. If you are including this section in the Implementation View, explain them well enough that a business person can broadly understand what each pattern is for. For instance, if you are using the Lazy Loading pattern, explain what problem it solves and why you are using it.

Needless to say, you also have to decide which architecture style you are suggesting, such as 3-tier, N-tier, client-server etc. Once you have declared that, explain the components of the system (layers, tiers and their relationships) with diagrams.

This part must also include your implementation strategy for addressing the quality of service requirements, such as how you will address scaling out.

  • Data View: If the application is data-centric, explain the overall approach to data management (never put a database design in this part), your backup and restore strategy, and your disaster recovery strategy.

Be iterative

It is suggested that the architecture (and, as a result, the Software Architecture Document) be developed through two or more iterations. It is impossible to build a comprehensive architecture document in one iteration: not only does the architecture have an impact on the requirements, but it also begins at an early stage, when many of the scenarios are likely to change.

How to prove that?

Now that, after a lot of effort, you have prepared your SAD, how will you prove it to the stakeholders? I assume that many business people do not have any idea about the content and structure of a SAD or the amount of information that you must include in it.

A good approach is to prepare a presentation about the mission of the system, its scope, goals, visions and your approach. Invite the stakeholders to a meeting, present the architecture to them and explain how the architecture covers their business needs. If they are not satisfied, your architecture is very likely incomplete.


ASP.NET MVC 3 Sample Application


Here is an ASP.NET MVC 3 sample web site that lets ASP.NET MVC learners see how a real application is developed and how it works. This is the first version of the app; more versions are lined up.

This ASP.NET MVC application is intended to be as simple as possible, so keep in mind that it is not a comprehensive commercial application. The database, user interface and code snippets are intended to be concise and clear. Future versions will cover the use of Ajax and ASP.NET MVC UI controls.

What is this application about?

WebAdvert is an online advertisement web site. Users can sign up and post their own adverts, and they can manage (view/edit/delete) their ads. Anonymous users can browse the existing ads. Finally, administrators can view/create/delete the members and assign them to the “Admins” role if necessary.
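The permission model above could be sketched in MVC 3 terms roughly as follows (the controller and action names here are illustrative assumptions; the real WebAdvert code may organise things differently):

```csharp
using System.Web.Mvc;

// Hypothetical controller sketch mirroring the described roles.
public class AdvertsController : Controller
{
    // Anonymous visitors can browse the existing ads.
    public ActionResult Index()
    {
        return View();
    }

    // Only signed-in members can create and manage their own ads.
    [Authorize]
    public ActionResult Create()
    {
        return View();
    }

    // Only administrators can view/create/delete members.
    [Authorize(Roles = "Admins")]
    public ActionResult ManageMembers()
    {
        return View();
    }
}
```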

WebAdvert uses ASP.NET Forms authentication in an MVC fashion, which means the application includes the AspnetDB database. It also has a SQL Server Express database file named WebAdvert, which contains only one table, named “Adverts”, where the ads are stored. The structure of this table is as below:

Figure 1: Structure of Adverts table

Use the following credentials to login:

User name: admin

Password: password)_

 

Figure 2: The browsing page

 

Prerequisites

  • Visual Studio 2010
  • Visual C# Express Edition 2010
  • Visual Web Developer 2010
  • ASP.NET MVC 3
  • Entity Framework 4
  • SQL Server 2008 Express Edition

 

Download

Download WebAdvert ASP.NET MVC 3.0 source code from webadvert.codeplex.com