February 15, 2013

Scaling Out SQL Server

Scalability is the ability of an application to efficiently use more resources in order to do more useful work. For example, an application that can service four users on a single-processor system may be able to service 15 users on a four-processor system. In this case, the application is scalable. If adding more processors doesn't increase the number of users serviced (for example, because the application is single-threaded), the application isn't scalable.
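To put a number on the example above, here is a minimal sketch that compares the users actually serviced with what perfectly linear scaling would give. The function name `scaling_efficiency` and its parameters are made up for illustration, and "efficiency" here is just one simple way to express the idea.

```python
def scaling_efficiency(base_users, base_cpus, scaled_users, scaled_cpus):
    """Return the fraction of ideal (linear) scaling that was achieved.

    Hypothetical helper for illustration: 1.0 means capacity grew in step
    with the processors added; values near base_cpus / scaled_cpus mean
    the extra processors added nothing.
    """
    # Users we would expect if capacity grew in proportion to processors.
    ideal_users = base_users * (scaled_cpus / base_cpus)
    return scaled_users / ideal_users


# The example from the text: 4 users on 1 processor, 15 users on 4.
print(scaling_efficiency(4, 1, 15, 4))  # 0.9375

# A single-threaded application: still 4 users on 4 processors.
print(scaling_efficiency(4, 1, 4, 4))   # 0.25
```

In the first case the application gets about 94% of the ideal 16 users, so it scales well. In the second case it gets no benefit from the extra processors.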

There are two kinds of scalability: scaleup and scaleout.

Scaleup means scaling to a bigger, more powerful server, for example going from a four-processor server to a 64-processor or 128-processor server. This is the most common way for databases to scale. When your database runs out of resources on your current hardware, you go out and buy a bigger box with more processors and more memory. Scaleup has the advantage of not requiring significant changes to the database. In general, you just install your database on a bigger box and keep running the way you always have, with more database power to handle a heavier load.

Scaleout means expanding to multiple servers rather than a single, bigger server. Scaleout usually has some initial hardware cost advantages: eight four-processor servers generally cost less than one 32-processor server. With scaleout, you separate or partition the database system so that the parts can be placed on separate database servers. This allows you to spread the processing load across as many servers as necessary to accommodate growth. However, the additional capability comes with additional complexity. A scaleout database scenario is not particularly easy to design or administer. You must answer many difficult business and technology-driven questions before you can successfully implement a scaleout of a database system.
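One of those design questions is how to decide which server holds which part of the data. The sketch below shows one common approach, hash-based partitioning done in the application layer. It is an illustration only, not a SQL Server feature: the server names, the `server_for` function, and the choice of customer ID as the partitioning key are all assumptions.

```python
import hashlib

# Hypothetical server names; a real deployment would use its own hosts.
SERVERS = ["sqlnode-a", "sqlnode-b", "sqlnode-c", "sqlnode-d"]


def server_for(customer_id, servers=SERVERS):
    """Pick the server that owns a given customer's rows.

    Uses a stable hash (SHA-256) so the same customer_id always maps
    to the same server, regardless of process or machine.
    """
    digest = hashlib.sha256(str(customer_id).encode("utf-8")).digest()
    # Turn the first 8 bytes of the digest into an integer bucket number.
    bucket = int.from_bytes(digest[:8], "big") % len(servers)
    return servers[bucket]


# Route a few customers to their partitions.
for customer_id in (1001, 1002, 1003):
    print(customer_id, "->", server_for(customer_id))
```

Note that with simple modulo hashing like this, adding or removing a server changes where most keys land, so existing data has to be moved. That is one example of the administrative complexity mentioned above.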

There is no rule of thumb for choosing between scaleup and scaleout. If hardware costs are lower than licensing and maintenance costs, scaleup is generally better than scaleout. If availability matters more, scaleout is better than scaleup: when one of your N machines fails, the impact is smaller and the system as a whole stays up and running. The same applies not only to failures but also to hardware, OS, and software updates and upgrades.
