Saturday, 30 March 2013

IT Business Lab Class 10

IT Business Lab: Assignment 10 - 3D Plotting



Assignment 1: 

Create 3 vectors x, y and z with random values, bind them together, and create 3-dimensional plots of the result.


Commands:

First, create a random data set of 50 items with mean = 30 and standard deviation = 10

> data <- rnorm(50,mean=30,sd=10)
> data

Take three samples of length 10 from the created data set and store them in the vectors x, y and z
> x <- sample(data,10)
> x

> y <- sample(data,10)
> y

> z <- sample(data,10)
> z

Bind the three vectors x, y and z column-wise into a matrix T using cbind
> T <- cbind(x,y,z)
> T
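Because rnorm() and sample() draw fresh random numbers on every run, the values shown here will differ between sessions. The following is a minimal sketch, assuming a fixed seed is acceptable for the exercise; the seed value 42 and the matrix name m are illustrative choices, not part of the original commands. It makes the data set reproducible and checks that cbind() returns a 10 x 3 matrix:

```r
# Fix the random seed so the same values are drawn on every run
# (42 is an arbitrary, illustrative choice).
set.seed(42)
data <- rnorm(50, mean = 30, sd = 10)

# Three samples of 10 values each, drawn without replacement
x <- sample(data, 10)
y <- sample(data, 10)
z <- sample(data, 10)

# cbind() joins the vectors column-wise into a numeric matrix.
# The name m is used here instead of T because T is also R's shorthand for TRUE.
m <- cbind(x, y, z)
dim(m)        # number of rows and columns
colnames(m)   # column names are taken from the vector names
```

Using a name other than T is only a suggestion; the commands in this post work with T as written.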
          


Data Set

Plotting 3d graph 

Command:

> library(rgl)
> plot3d(T[,1:3])
3D plot

Plotting of graph with labels for axes and color

Command 
> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(500))
3D plot with color

Plotting of graph with labels for axes, color and type = spheres

Command
> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='s')
3D Plot with spheres


Plotting of graph with labels for axes, color and type = points

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='p')
3D Plot with Points


Plotting of graph with labels for axes, color and type = lines

Command

> plot3d(T[,1:3], xlab="X Axis" , ylab="Y Axis" , zlab="Z Axis", col=rainbow(5000), type='l')
3D Plot with Lines



Assignment 2:

Choose 2 random variables and create the following plots:
1. X-Y
2. X-Y|Z (introduce a variable z with 5 different categories and cbind it to x and y)
3. Color code and draw the graph
4. Smooth and best fit line for the curve

Commands

> x <- rnorm(5000, mean= 20 , sd=10)
> y <- rnorm(5000, mean= 10, sd=10)
> z1 <- sample(letters, 5)
> z2 <- sample(z1, 5000, replace=TRUE)
> z <- as.factor(z2)
> z
Data Set
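To confirm that z really has 5 categories, the factor can be inspected directly. This is a small sketch using only base R; the seed is an arbitrary assumption added for reproducibility and is not part of the original commands:

```r
set.seed(1)  # arbitrary seed, for reproducibility only

z1 <- sample(letters, 5)                 # 5 distinct category labels
z2 <- sample(z1, 5000, replace = TRUE)   # 5000 draws from those 5 labels
z  <- as.factor(z2)

nlevels(z)   # number of distinct categories
table(z)     # number of observations in each category
```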

Creating Quick Plots

Command:

> library(ggplot2)
> qplot(x,y)
x and y qplot

> qplot(x,z)
x and z qplot

For semi-transparent plot

> qplot(x,z, alpha=I(2/10))
Semi-transparent Plot


For coloured plot

> qplot(x,y, color=z)

Colored plot


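For logarithmic coloured plot (x and y contain some negative values, for which log() returns NaN; those points are dropped with a warning)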

> qplot(log(x),log(y), color=z)
Logarithmic Plot
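Since x and y are drawn from normal distributions that include negative values, not every point survives the log transform. The sketch below (base R only; the seed and the helper names keep, lx and ly are illustrative assumptions) counts how many points are lost and keeps only those that can be plotted on a log-log scale:

```r
set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(5000, mean = 20, sd = 10)
y <- rnorm(5000, mean = 10, sd = 10)

# log() is only defined for positive numbers, so flag the usable points
keep <- x > 0 & y > 0
sum(!keep)            # number of points that cannot be shown on a log-log plot

# Log-transform only the usable points
lx <- log(x[keep])
ly <- log(y[keep])
```

If z is used for colour, it has to be subset with the same keep vector so the three vectors stay aligned.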

Best Fit and Smooth curve using "geom"

Command:

> qplot(x,y,geom=c("path","smooth"))

geom='path'

> qplot(x,y,geom=c("point","smooth"))

geom='point'


> qplot(x,y,geom=c("boxplot","jitter"))
geom='boxplot' and 'jitter'
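For comparison, a best-fit line and a smooth curve can also be computed explicitly in base R. This is only a sketch under stated assumptions: lm() and lowess() are stand-ins chosen for illustration and are not necessarily the smoother that qplot applies, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, for reproducibility only
x <- rnorm(5000, mean = 20, sd = 10)
y <- rnorm(5000, mean = 10, sd = 10)

fit <- lm(y ~ x)      # least-squares best-fit line
coef(fit)             # intercept and slope

sm <- lowess(x, y)    # locally weighted smooth curve

plot(x, y, col = "grey", pch = 20)
abline(fit, col = "red")    # best-fit line
lines(sm, col = "blue")     # smooth curve
```

Because x and y are generated independently, the fitted slope should be close to zero and both curves should be nearly flat.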

Sunday, 24 March 2013

IT Business Lab Class 9

 
 QlikView as Visualization tool:

 
An infographics/data visualization tool that I have studied and found highly sophisticated yet user-friendly is QlikView.

 
Features:
 



This is one of the most widely used data visualization tools. It enables the user to:
  • Consolidate relevant data from multiple sources into a single application
  • Explore the associations in the data
  • Enable social decision making through secure, real-time collaboration
  • Visualize data with engaging, state-of-the-art graphics
  • Search across all data, directly and indirectly
  • Interact with dynamic apps, dashboards and analytics
  • Access, analyze and capture data from mobile devices

The QlikView Difference over others
  • Has an inference engine that maintains the associations in the data automatically
  • Calculates aggregations on the fly, as needed, for a super-fast user experience
  • Compresses data down to 10% of its original size to optimize the power of the processors
  • Accomplishes both within a single, comprehensive product

Go to http://ap.demo.qlikview.com/download/.

Install the application using valid credentials.

The home screen looks like:
 




 
Choose any supported file.

I chose an Excel file containing a few rows of NIFTY historical data, as follows:

Date       Open     High     Low      Close    Shares Traded  Turnover (Rs. Cr)
1-Oct-12   5704.75  5722.95  5694     5718.8   123138510      4798.17
3-Oct-12   5727.7   5743.25  5715.8   5731.25  165037864      6654.02
4-Oct-12   5751.55  5807.25  5751.35  5787.6   171404290      6954.74
5-Oct-12   5815     5815.35  4888.2   5746.95  255569804      12995.8
8-Oct-12   5751.85  5751.85  5666.2   5676     142319000      5853.56
9-Oct-12   5708.15  5728.65  5677.9   5704.6   119300415      5047.01
10-Oct-12  5671.15  5686.5   5647.05  5652.15  126294361      4564.39
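As a quick cross-check outside QlikView, the same rows can be typed into R to see which day stands out. This is a sketch only: the data frame name nifty and the derived column range are illustrative assumptions, and only the High and Low columns from the table are used.

```r
# High and Low values copied from the NIFTY table above
nifty <- data.frame(
  date = c("1-Oct-12", "3-Oct-12", "4-Oct-12", "5-Oct-12",
           "8-Oct-12", "9-Oct-12", "10-Oct-12"),
  high = c(5722.95, 5743.25, 5807.25, 5815.35, 5751.85, 5728.65, 5686.5),
  low  = c(5694, 5715.8, 5751.35, 4888.2, 5666.2, 5677.9, 5647.05)
)

# Daily trading range = High - Low
nifty$range <- nifty$high - nifty$low

# Row with the widest range
nifty[which.max(nifty$range), ]
```

The widest range falls on 5-Oct-12, where the Low of 4888.2 is far below the other days.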

After loading the data, several types of visualization are available, such as:
Bar chart
Line chart
Combo chart
Scatter chart
Grid chart
Straight Table
Pivot Table

I used some of the charts listed above to draw out a few observations:

Fig 1:






 
Fig 2:



Fig 3:
 
 
 
 
 




Summary:
  • QlikView works well when the database is small, but in practice databases are rarely small.
  • Alerts: the capability to create alerts and deliver them not only by email but also to BlackBerry devices, handheld devices, mobile phones, etc.
  • Multi-user development environment: this feature allows multiple developers to work on a single project, with the utility synchronizing the pieces each developer is working on with the main project. QlikView completely lacks this feature.
  • Connecting to and extracting data from multidimensional objects.
  • Support for advanced features such as an embedded browser (available in Hyperion Interactive Reporting), flickers (rolling messages), etc. as standard options.

Friday, 15 March 2013

IT Business Lab Class 8



IT Business Lab Class 8 - Panel Data Analysis
  To carry out a panel data analysis of the "Produc" data set and to analyze the 3 types of model:
      1 Pooled effects model
      2 Fixed effects model
      3 Random effects model

To determine the most suitable model using the following test functions:
       pFtest : Fixed vs Pooled
       plmtest : Pooled vs Random
       phtest: Random vs Fixed


  Pooled Model

library(plm)
data("Produc", package = "plm")
pool <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
            data = Produc, model = "pooling", index = c("state", "year"))

 Fixed Model

fixed <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
             data = Produc, model = "within", index = c("state", "year"))

 Random Model

random <- plm(log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) + log(gsp) + log(emp) + log(unemp),
              data = Produc, model = "random", index = c("state", "year"))
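The three fitted models can then be compared with pFtest, plmtest and phtest. The following is a sketch that refits the models so it runs on its own; the formula object f and the result names pf, lmt and ha are illustrative assumptions, and plmtest is called with its default settings because the original call is not shown.

```r
library(plm)                       # provides plm(), pFtest(), plmtest(), phtest()
data("Produc", package = "plm")

# Same model formula as above, stored once and reused
f <- log(pcap) ~ log(hwy) + log(water) + log(util) + log(pc) +
  log(gsp) + log(emp) + log(unemp)

pool   <- plm(f, data = Produc, model = "pooling", index = c("state", "year"))
fixed  <- plm(f, data = Produc, model = "within",  index = c("state", "year"))
random <- plm(f, data = Produc, model = "random",  index = c("state", "year"))

pf  <- pFtest(fixed, pool)     # F test: fixed effects vs pooled
lmt <- plmtest(pool)           # Lagrange multiplier test: random effects vs pooled
ha  <- phtest(fixed, random)   # Hausman test: fixed vs random effects

pf$p.value
lmt$p.value
ha$p.value
```

A small p-value in each test rejects the corresponding null hypothesis described in the sections that follow.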

Pooled vs Fixed 

Null Hypothesis: There are no significant individual (state) effects, so the pooled model is adequate
Alternate Hypothesis: The individual effects are significant, so the fixed model is preferred


Since the p-value is very small, we reject the null hypothesis and accept the alternate hypothesis: the fixed model is better than the pooled model.

Pooled vs Random 

Null Hypothesis: There are no significant random effects, so the pooled model is adequate
Alternate Hypothesis: Random effects are present, so the random model is preferred over the pooled model

Since the p-value is very small, we reject the null hypothesis and accept the alternate hypothesis: the random model is better than the pooled model.

Random vs Fixed 

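Null Hypothesis: The individual effects are not correlated with the regressors, so the random model is appropriate
Alternate Hypothesis: The individual effects are correlated with the regressors, so the fixed model is appropriate

Since the p-value is negligible, we reject the null hypothesis and accept the alternate hypothesis: the fixed model is preferred.

Conclusion: 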

  The tests indicate that the fixed model is the best suited for the panel data analysis of the "Produc" data set. We also observe that within the same id the variation is null.