August 26, 2008

SQL Injection

What is SQL Injection?
SQL Injection is the number one exploit used by hackers to steal information and deface websites. SQL Injection is based on malicious altering of SQL statements from their intended format, by exploiting weaknesses of a client application that is used to create them. Most commonly, this takes place in Internet or intranet scenarios, where users are expected to enter data via text boxes presented through a Web-page interface, which, in turn is used to populate a dynamically formed query. The most classic case of such attacks (although certainly not the only one) involves a design, in which access to a database is controlled by the username and password pair stored in one of its tables. In the case of a match, a user is automatically redirected to another Web page, from which other SQL statements can be run.
Using SQL Injection, attackers can:
1. Change or delete website content
2. Steal user information, such as email addresses,
username & password, credit card details
3. Access databases connected to the website

An SQL Injection attack means that hackers are injecting SQL commands into form fields or URL parameters, which may be executed by the database of unprotected web servers.

SQL Injection example :
The dynamic SQL will put system at risk for SQL injection attacks, in which an attacker compromises the system's security by executing unauthorized code.
The following example show how a user exploits the dynamic SQL in a stored procedure. The Web page in this example lets the user specify searches for users based on their name. The Web page executes the following stored procedure, SearchUser:
CREATE TABLE [dbo].[Users](
[UserId] [int] NULL,
[Name] [varchar](50) COLLATE Latin1_General_CI_AS NULL
) ON [PRIMARY]

CREATE PROCEDURE SearchUser
(@name varchar(100))
AS
BEGIN
DECLARE @SQL nvarchar(4000), @WhereClause nvarchar(500)
SET @SQL = N'SELECT * FROM dbo.Users'
SET @WhereClause = N' WHERE 1=1'
IF @name IS NOT NULL
SET @WhereClause = @WhereClause + ' AND name =
''' + @name + ''''
EXEC(@SQL + @WhereClause)
END
GO

The parameters @user let users conduct wildcard searches by using LIKE patterns.
Using the SearchUser stored procedure, webuser issues the following command to return the generated queries and a list of users from the users table, which demonstrates the security danger of forming dynamic SQL without specific checks:
exec SearchUser 'avi';SELECT * FROM Users;
By injecting SQL code into the search parameters, an attacker can potentially perform unauthorized actions depending on the permissions of the user account, the Web page, or application executing the stored procedure.

How to secure SQL Server databases from SQL injection:
There are a few ways to protect your database against these kinds of attacks. First we need to lock down the database security using database security best practices. This involves setting up the database security with the lowest set of permissions possible. It also includes not using any table-level access to the tables. All access to the tables should be done through stored procedures, and those stored procedures should not include any dynamic SQL.
Without securing the application layer against SQL injection attacks, all bets are off. The data, as it comes into the database, is basically impossible to validate within the database. It needs to be validated at the application layer.
If you don't validate the data within the v_Input variable, then you leave yourself open to SQL injection attacks. If you don't validate the input, it allows the attacker to pass in a single quote, and a semicolon, which tells the SQL Server to end the value and the statement moving on to the next statement in the batch. You should do some basic validation and replace any single quotes within our variable with two single quotes.


Measures to avoid SQL injection :
Validate all input coming from the user on the server.
Avoid the use of dynamic SQL queries if there an alternate method is available.
Use parameterized stored procedure with embedded parameters. it is recommended you use parameterized queriesand filter all user input for non-alphanumeric characters.
Execute stored procedures using a safe interface such as Callable statements in JDBC or CommandObject in ADO.
Use a low privileged account to run the database.
Give proper roles and privileges to the stored procedure being used in the applications.

The most systematic method to do so is to set coding standards that require this to be done. If the code is
already written, a code review should be done to detect any vulnerabilities. It is also recommended you
look at some of the automated tools available for detecting these types of problems.
Even if you feel you have closed all known vulnerabilities, it is still in your best interest to prevent these
specific attacks by disabling some of SQL Server’s functionality. This is not practical if you are actually
using the functionality. Fortunately, the functionality we are looking to disable is not used often.

August 19, 2008

Pivots – Converting Rows to Columns SQL Server 2005

Sometimes it is necessary to rotate results so that [the data in] columns are presented horizontally and [the data in] rows are presented vertically. This is known as creating a PivotTable, creating a cross-tab report, or rotating data.
Lets create table to use Pivot function.


CREATE TABLE TestPivot
( EmpID INT,
SalaryComponent INT,
SalaryAmount INT
)
go
INSERT INTO TestPivot
SELECT 1234, 1, 1000
UNION ALL
SELECT 1234, 2, 20
UNION ALL
SELECT 1234, 3, 4500
UNION ALL
SELECT 123, 1, 2000
UNION ALL
SELECT 123, 2, 30
UNION ALL
SELECT 123, 3, 4500
UNION ALL
SELECT 234, 1, 5000
UNION ALL
SELECT 234, 3, 4500
go

We need to show data like as follows:

EmpId Comp1 Comp2 Comp3 Comp4 Salary
123 2000 30 4500 0 6530
234 5000 0 4500 0 9500
1234 1000 20 4500 0 5520


This can be done using query:

SELECT EmpID,
SUM(CASE WHEN SalaryComponent = 1 THEN SalaryAmount ELSE 0 END) AS [Comp1],
SUM(CASE WHEN SalaryComponent = 2 THEN SalaryAmount ELSE 0 END) AS [Comp2],
SUM(CASE WHEN SalaryComponent = 3 THEN SalaryAmount ELSE 0 END) AS [Comp3],
SUM(CASE WHEN SalaryComponent = 4 THEN SalaryAmount ELSE 0 END) AS [Comp4],
SUM(SalaryAmount) AS Salary
FROM TestPivot
GROUP BY EmpID

let's see how to use PIVOT to do the same thing...

SELECT EmpID,
COALESCE([1],0) AS [Comp1],
COALESCE([2],0) AS [Comp2],
COALESCE([3],0) AS [Comp3],
COALESCE([4],0) AS [Comp4],
COALESCE([1],0)+COALESCE([2],0)+COALESCE([3],0)+COALESCE([4],0) as Salary
FROM
(
SELECT EmpID,SalaryComponent,SalaryAmount
FROM TestPivot
) as T
PIVOT (SUM(SalaryAmount) FOR SalaryComponent IN ([1],[2],[3],[4])) AS P
ORDER BY EmpID

PIVOT identifies the aggregate to be used, the column to pivot in the FOR clause, and the list of values that we want to pivot in the IN clause... in this case, the quarter number. Also notice that you must treat those as if they were column names. They must either be put in brackets or double quotes (if the quoted identifier setting is ON).
You must also bring EmpIDup as the row identifier in the pivot. Think of this as your "anchor" for the rows.

If you want a total for each row in the pivot, you can no longer use just an aggregate. Instead, you must add all the columns together.

Notice the NULL's where there are no values or where a NULL has been added into a total. Remember that anything plus a NULL is still a NULL. All of this occurs because the Pivot doesn't do any substitutions like the Case statements. To fix this little problem, we have to use COALESCE (or ISNULL) on the columns

August 07, 2008

Monitoring server disk space in SQL Server

DBAs must monitor their SQL Servers carefully to ensure that the system and user databases have sufficient disk space for the life time of the applications using these SQL Servers. Failure to do so can result in the following problems:
All SQL Server jobs fail – if the msdb database does not have room to grow, job information will be unable to be logged and all of your SQL Server jobs will be unable to start.
Sorts, aggregates and operations which make heavy use of tempdb may fail if tempdb data or log files can't grow to accommodate these operations.
The transaction log may fill up and all DML operations on your database may fail and point in time recovery is lost.
Your database files themselves may fill up and all DML operations on your database may fail.
A database experiencing out of control growth of its transaction log or data files may fill up the entire disk causing other databases applications to fail.

Steps for monitoring disk space in SQL Server :

Auto-growth settings for logs and files
The best practice is to size your data and log files to sizes that minimize the number of auto growths. For data files, the best practice is not to use the Growth in Percent, but rather the Growth in Megabytes, and select a value that will work for your database.

File growth is an intensive disk I/O operation and if SQL Server has to wait for a data or log file to expand, you could certainly see this harm performance and response times. The default growth increment is 1MB for data files and 10 percent for log files, which can be insufficient growth increments for busy systems. Additionally, relying on autogrow can lead to disk fragmentation because the data or log files are not contiguous on disk—this means response times may be longer than necessary, as data is physically scattered on the disk.
To size your transaction log files you should:
Shrink the log as much as possible.
Dump every 10-20 minutes.
Monitor its growth over a week - noting the maximum size.
Lastly, dump the log, shrink it one more time to the minimum size and then manually size it for this observed maximum size.
This prevents too many virtual log files, which can lead to performance problems.

Check server disk space :
Xp_fixeddrives will give the amount of free space available on local fixed drives.

Check database files space :

select db_name(dbid),name,drive=left(filename,1),filename,
filegroup=filegroup_name(groupid),
'size'=convert(nvarchar(15),convert(bigint,size)*8) +N'KB',
'maxsize'=(case maxsize when -1 then N'Unlimited'
else convert(nvarchar(15),convert(bigint,maxsize)*8) +N'KB'
end),
'usage'=(case status & 0x40 when 0x40 then 'log only' else 'data only' end)
from sysaltfiles
where dbid= 7
order by usage
Monitor using performance monitor :
You can only use performance monitor to monitor log auto growths – there is no counter for data file growths. Use Performance Monitor to monitor the performance objects SQL Server Databases: Log Growths, Percent log Used, and Data File Size.
Performance Monitor allows you to watch specific databases or all databases, and if necessary, raise alerts to send net messages or write events to the Windows NT application log, which monitoring software (like NetIQ) can then pick up and react to.

August 04, 2008

Blocking and Deadlocking

Blocking
Blocking is a necessary side effect of using locks to control concurrent resource access. A lock is either compatible or incompatible with other locks depending on the lock mode. Before a lock is granted, SQL Server first checks to see if an incompatible lock on the same resource exists. If not, the lock request is granted and execution proceeds. When the requested lock mode is incompatible with an existing lock mode on the same resource, the requesting session is blocked and waits until:
1) The blocking session releases the lock
2) The LOCK_TIMEOUT threshold of the blocked session is reached (indefinite by default)
3) The blocked client cancels the query
4) The blocked client times out (30-second default in ADO/ADO.NET, technically the same as #3)

Deadlocking
A deadlock is basically a special blocking scenario where 2 sessions are waiting on each other (directly or indirectly). Neither can proceed so both will wait indefinitely unless a timeout or intervention occurs. Unlike a normal blocking scenario, SQL Server will intervene when a deadlock situation is detected and cancel one of the transactions involved. The locks of the cancelled transaction are then released so the other blocked session can proceed. SQL Server chooses the transaction that is the least expensive to rollback as the deadlock victim by default. If set deadlock priority has been issued, SQL Server chooses the one with the lowest priority as the victim.
A deadlock always starts as a normal block with one session waiting while the other continues. It is only when the running session is later blocked by the waiting session that the deadlock occurs.

You can view session and lock info during this blocking episode by executing sp_who2 and sp_lock from a new SSMS window or you can use the Activity Monitor from the SSMS Object Explorer GUI.

How To Calculate the Number of Week Days Between two Dates

How To Calculate the Number of Week Days Between two Dates
declare @StartDay datetime
declare @EndDay datetime
set @StartDay = getdate()-30
set @EndDay = getdate()
select DateDiff(dd, @StartDay, @EndDay) - DateDiff(ww, @StartDay, @EndDay)*2
If you have a table of holidays, then you can simply subtract them out as well:
declare @StartDay datetime
declare @EndDay datetime
set @StartDay = getdate()-30
set @EndDay = getdate()
select DateDiff(dd, @StartDay, @EndDay) -
DateDiff(ww, @StartDay, @EndDay)*2 -
(select count(*) from holidays where holiday_date between @StartDay and @EndDay)

Creating DataFrames from CSV in Apache Spark

 from pyspark.sql import SparkSession spark = SparkSession.builder.appName("CSV Example").getOrCreate() sc = spark.sparkContext Sp...