IT Business Lab: March 2013

Saturday, 30 March 2013

IT Business Lab Class 10

IT Business Lab: Assignment 10, 3D Plotting

Assignment 1:

Create three vectors x, y and z with random values, bind them together, and create three-dimensional plots of the result.

Commands:

First, create a random data set of 50 values with mean = 30 and standard deviation = 10

> data <- rnorm(50,mean=30,sd=10)
> data

Taking sample data of length 10 from the created data set in three different vectors x,y,z
> x <- sample(data,10)
> x

> y <- sample(data,10)
> y

> z <- sample(data,10)
> z

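Binding the three vectors x, y, z into a 10 x 3 matrix T using cbind (note that assigning to T masks R's built-in shorthand T for TRUE for the rest of the session)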
> T <- cbind(x,y,z)
> T

Data Set
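
The steps above can be collected into one reproducible script. This is only a minimal sketch: the set.seed() call and the name pts (standing in for T, so that the built-in T is left alone) are additions that are not in the original commands.

```r
# Fix the random seed so the same values come out on every run (assumption:
# the original post did not set a seed).
set.seed(42)

# 50 normally distributed values, mean 30 and standard deviation 10
data <- rnorm(50, mean = 30, sd = 10)

# Three samples of 10 values each, drawn without replacement from data
x <- sample(data, 10)
y <- sample(data, 10)
z <- sample(data, 10)

# cbind() puts the vectors side by side as the columns of a matrix
pts <- cbind(x, y, z)
dim(pts)       # number of rows and columns
colnames(pts)  # column names are taken from the vector names
```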

Plotting 3d graph

Command:

> library(rgl)   # plot3d() comes from the rgl package
> plot3d(T[,1:3])

3D plot

Plotting of graph with labels for axes and color

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(500))

3D plot with color

Plotting of graph with labels for axes, color and type = spheres

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='s')

3D Plot with spheres

Plotting of graph with labels for axes, color and type = points

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='p')

3D Plot with Points

Plotting of graph with labels for axes, color and type = lines

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='l')

3D Plot with Lines
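
The commands above ask rainbow() for 500 or 5000 colours although only 10 points are plotted, so at most 10 of those colours can appear. The sketch below sizes the palette to the number of points instead. The names pts and cols are hypothetical, the data are regenerated here so the block runs on its own, and the plot is drawn only in an interactive session with rgl installed.

```r
set.seed(42)

# Stand-in for the bound matrix: 10 rows, columns x, y, z
pts <- cbind(x = rnorm(10, 30, 10),
             y = rnorm(10, 30, 10),
             z = rnorm(10, 30, 10))

# One distinct colour per point
cols <- rainbow(nrow(pts))

# Draw the spheres plot only when a display is likely to be available
if (interactive() && requireNamespace("rgl", quietly = TRUE)) {
  rgl::plot3d(pts[, 1:3],
              xlab = "X Axis", ylab = "Y Axis", zlab = "Z Axis",
              col = cols, type = "s")
}
```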

Assignment 2:

Choose 2 random variables

Create the following plots:

1. X-Y

2. X-Y|Z (introducing a variable z with 5 different categories and binding it to x and y)

3. Color code and draw the graph

4. Smooth and best fit line for the curve

Commands

> x <- rnorm(5000, mean= 20 , sd=10)

> y <- rnorm(5000, mean= 10, sd=10)

> z1 <- sample(letters, 5)

> z2 <- sample(z1, 5000, replace=TRUE)

> z <- as.factor(z2)

> z

Data Set
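
A quick way to confirm that z really has 5 categories is to count its levels and tabulate them. This is a small sketch with a seed added for reproducibility (the seed is an assumption, not part of the original commands).

```r
set.seed(42)

# Pick 5 distinct letters, then draw 5000 labels from them with replacement
z1 <- sample(letters, 5)
z2 <- sample(z1, 5000, replace = TRUE)
z  <- as.factor(z2)

nlevels(z)  # number of categories
table(z)    # how many observations fall in each category
```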

Creating Quick Plots

Command:

> library(ggplot2)   # qplot() comes from the ggplot2 package
> qplot(x,y)

x and y qplot

> qplot(x,z)

x and z qplot

For semi-transparent plot

> qplot(x,z, alpha=I(2/10))

Semi-transparent Plot

For coloured plot

> qplot(x,y, color=z)

Colored plot

For Logarithmic coloured plot

> qplot(log(x),log(y), color=z)

Logarithmic Plot
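
Because x and y are normally distributed, some values are negative, and log() of a negative number is NaN (R issues a warning). One possible way to handle this, sketched below, is to keep only the rows where both values are positive before taking logs. The names keep, lx and ly are hypothetical and the seed is an assumption.

```r
set.seed(42)
x <- rnorm(5000, mean = 20, sd = 10)
y <- rnorm(5000, mean = 10, sd = 10)

# TRUE for rows where both coordinates have a finite logarithm
keep <- x > 0 & y > 0

lx <- log(x[keep])
ly <- log(y[keep])

sum(!keep)  # number of rows left out of the logarithmic plot
```

The filtered vectors can then be plotted with qplot(lx, ly), using z[keep] for the colour.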

Best Fit and Smooth curve using "geom"

Command:

> qplot(x,y,geom=c("path","smooth"))

geom='path'

> qplot(x,y,geom=c("point","smooth"))

geom='point'

> qplot(x,y,geom=c("boxplot","jitter"))

geom='boxplot' and 'jitter'
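
qplot() is a shortcut, and recent ggplot2 releases mark it as deprecated. The sketch below shows roughly how the coloured point plot with a fitted line could be written with ggplot() instead. The data frame name df, the choice of method = "lm" for a straight best-fit line, and the seed are assumptions made for illustration.

```r
set.seed(42)

# Same kind of data as above, gathered in one data frame
df <- data.frame(
  x = rnorm(5000, mean = 20, sd = 10),
  y = rnorm(5000, mean = 10, sd = 10),
  z = factor(sample(sample(letters, 5), 5000, replace = TRUE))
)

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  # Semi-transparent points coloured by category, plus one straight
  # least-squares line per category (colour = z also groups the fit)
  p <- ggplot(df, aes(x = x, y = y, colour = z)) +
    geom_point(alpha = 0.2) +
    geom_smooth(method = "lm", se = FALSE)
  if (interactive()) print(p)
}
```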

Sunday, 24 March 2013

IT Business Lab Class 9

QlikView as Visualization tool:

An infographics/data visualization tool that I have studied and found highly sophisticated yet user-friendly is QlikView.

Features:

QlikView is one of the most widely used data visualization tools. Its features include:

Consolidating relevant data from multiple sources into a single application
Exploring the associations in the data
Enabling social decision making through secure, real-time collaboration
Visualizing data with engaging, state-of-the-art graphics
Searching across all data—directly and indirectly
Interacting with dynamic apps, dashboards and analytics
Accessing, analyzing and capturing data from mobile devices

The QlikView Difference over others

Has an inference engine that maintains the associations in the data automatically
Calculates aggregations on the fly, as needed, for a super-fast user experience
Compresses data down to 10% of its original size to optimize the power of the processors
Accomplishes both within a single, comprehensive product

Go to http://ap.demo.qlikview.com/download/.

Install the application using valid credentials.

The home screen looks like:

Choose any supported file.

I chose an Excel file containing a few days of NIFTY historical data, as follows:

Date	Open	High	Low	Close	Shares Traded	Turnover (Rs. Cr)
1-Oct-12	5704.75	5722.95	5694	5718.8	123138510	4798.17
3-Oct-12	5727.7	5743.25	5715.8	5731.25	165037864	6654.02
4-Oct-12	5751.55	5807.25	5751.35	5787.6	171404290	6954.74
5-Oct-12	5815	5815.35	4888.2	5746.95	255569804	12995.8
8-Oct-12	5751.85	5751.85	5666.2	5676	142319000	5853.56
9-Oct-12	5708.15	5728.65	5677.9	5704.6	119300415	5047.01
10-Oct-12	5671.15	5686.5	5647.05	5652.15	126294361	4564.39

After loading the data, several types of visualization are available, such as:

Bar chart

Line chart

Combo chart

Scatter chart

Grid chart

Straight Table

Pivot Table

I used some of the charts mentioned above to arrive at a few observations:

Fig 1:

Fig 2:

Fig 3:

Summary:

QlikView works well when the database is small, but in practice the database is rarely small.
Alerts: the capability to create alerts and deliver them not only by email but also to BlackBerry devices, handhelds, mobile phones, etc.
Multiuser development environment: this feature allows multiple developers to work on a single project, with a utility that synchronizes the piece each developer is working on with the main project. QlikView completely lacks this feature.
Connecting to and extracting data from multidimensional objects.
Support for advanced features such as an embedded browser (available in Hyperion Interactive Reporting), flickers (rolling messages), etc. as standard options.

Friday, 15 March 2013

IT Business Lab Class 8

IT Business Lab Class 8 - Panel Data Analysis

To carry out panel data analysis of the "Produc" data set and to analyze three types of model:
1 Pooled model
2 Fixed effects model
3 Random effects model

To determine the most suitable model using the following functions:
pFtest : Fixed vs Pooled
plmtest : Pooled vs Random
phtest: Random vs Fixed

Pooled Model

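library(plm)                     # plm(), pFtest(), plmtest(), phtest()
data("Produc", package = "plm")  # the Produc data set ships with plm

pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
            data = Produc, model = "pooling", index = c("state", "year"))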

Fixed Model

fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
             data = Produc, model = "within", index = c("state", "year"))

Random Model

random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
              data = Produc, model = "random", index = c("state", "year"))

Pooled vs Fixed

Null Hypothesis: The pooled model is adequate (there are no significant individual effects)

Alternate Hypothesis: The fixed effects model fits better than the pooled model

Since the p-value is very small, we reject the null hypothesis and conclude that the fixed effects model is better than the pooled model.

Pooled vs Random

Null Hypothesis: The pooled model is adequate (there are no random effects)

Alternate Hypothesis: The random effects model fits better than the pooled model

Since the p-value is very small, we reject the null hypothesis and conclude that the random effects model is better than the pooled model.

Random vs Fixed

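Null Hypothesis: The individual effects are not correlated with the regressors, so the random effects model is appropriate

Alternate Hypothesis: The individual effects are correlated with the regressors, so the fixed effects model is appropriate

Since the p-value is negligible, we reject the null hypothesis and choose the fixed effects model.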
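
The three comparisons above can be sketched as one script. This is a minimal sketch, assuming the plm package is installed. The names f, pf, lm_test and hm are hypothetical, and type = "bp" (Breusch-Pagan) is one of several variants that plmtest() offers, chosen here only for illustration.

```r
library(plm)
data("Produc", package = "plm")

# Same model formula for all three fits
f <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
  log(gsp) + log(emp) + log(unemp)

pool   <- plm(f, data = Produc, model = "pooling", index = c("state", "year"))
fixed  <- plm(f, data = Produc, model = "within",  index = c("state", "year"))
random <- plm(f, data = Produc, model = "random",  index = c("state", "year"))

# Fixed vs pooled: F test for individual effects
pf <- pFtest(fixed, pool)

# Pooled vs random: Lagrange multiplier test on the pooled model
lm_test <- plmtest(pool, type = "bp")

# Random vs fixed: Hausman test
hm <- phtest(fixed, random)

# Collect the three p-values side by side
c(pFtest = unname(pf$p.value),
  plmtest = unname(lm_test$p.value),
  phtest = unname(hm$p.value))
```

A small p-value in each test points away from the null hypothesis stated for that comparison.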

Conclusion:

The tests indicate that the fixed effects model is the best suited for panel data analysis of the "Produc" data set. We also observe that within the same id the variation is null.