**Introduction to Statistics**

– Statistics is the study of the collection, presentation, analysis and interpretation of data.

**Data**

- Data are facts or figures, numerical or otherwise, collected for a definite purpose.
- Primary data: Data collected first-hand by the investigator.
- Secondary data: Data obtained from a source that had already collected and stored it.

**Frequency**

– The number of times a particular instance occurs is called frequency in statistics.

**Ungrouped data**

Ungrouped data is data in its original or raw form. The observations are not classified in groups.

**Grouped data**

In grouped data, observations are organized in groups.

**Class Interval**

- A class interval is the range of values covered by one group (class) into which the data are divided.
- E.g., the divisions on a histogram or bar graph.

**Class width** = upper class limit – lower class limit

**Regular and Irregular class interval**

- Regular class interval: When all the class intervals are of the same size.
- E.g., 0-10, 10-20, 20-30, …, 90-100
- Irregular class interval: When the class intervals are of varying sizes.
- E.g., 0-35, 35-45, 45-55, 55-80, 80-90, 90-95, 95-100
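As a small illustration of the definitions above, the sketch below computes the class width of each interval and checks whether a set of intervals is regular. The function names `class_widths` and `is_regular` are made up for this example, and the intervals are the ones listed above (the regular list is shortened).

```python
def class_widths(intervals):
    """Return the width (upper limit - lower limit) of each class interval."""
    return [upper - lower for lower, upper in intervals]


def is_regular(intervals):
    """True when every class interval has the same width."""
    return len(set(class_widths(intervals))) <= 1


regular = [(0, 10), (10, 20), (20, 30)]
irregular = [(0, 35), (35, 45), (45, 55), (55, 80), (80, 90), (90, 95), (95, 100)]

print(class_widths(irregular))                      # [35, 10, 10, 25, 10, 5, 5]
print(is_regular(regular), is_regular(irregular))   # True False
```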

**Frequency table**

– A frequency table (or frequency distribution) shows, in tabular form, how often each value of a variable occurs.
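A frequency table can be built from raw data by counting how often each value occurs. The sketch below uses Python's `collections.Counter`; the list `marks` is hypothetical data invented for the example.

```python
from collections import Counter

# Hypothetical raw data: marks obtained by ten students.
marks = [7, 9, 7, 8, 10, 9, 7, 8, 9, 7]

# Counter maps each distinct value to the number of times it occurs.
frequency = Counter(marks)

# Print the table with the values in ascending order.
print("Marks | Frequency")
for value in sorted(frequency):
    print(f"{value:>5} | {frequency[value]}")
```

The frequencies add up to the total number of observations, which is a quick check that nothing was missed.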

**Sorting**

- Raw data needs to be sorted before operations can be carried out on it.
- Sorting ⇒ arranging the data in ascending or descending order

**Ungrouped frequency table**

– A frequency table that lists each distinct observation with its frequency, without grouping the observations into class intervals.

**Grouped frequency table**

– A frequency table in which the observations are grouped into class intervals, arranged in ascending or descending order, with the frequency of each class interval listed against it.
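One possible way to build a grouped frequency table is sketched below. It assumes equal class widths and the convention that each class includes its lower limit but excludes its upper limit (so 10 is counted in 10-20, not 0-10). The function name `grouped_frequency` and the data in `scores` are hypothetical.

```python
def grouped_frequency(data, lower, width, n_classes):
    """Count observations in n_classes intervals of equal width.

    Classes start at `lower`. Each class includes its lower limit and
    excludes its upper limit (an assumed convention).
    """
    counts = [0] * n_classes
    for x in data:
        index = int((x - lower) // width)
        if 0 <= index < n_classes:
            counts[index] += 1
    return [
        ((lower + i * width, lower + (i + 1) * width), counts[i])
        for i in range(n_classes)
    ]


# Hypothetical raw scores.
scores = [3, 10, 12, 18, 21, 25, 25, 29, 7, 15]

table = grouped_frequency(scores, lower=0, width=10, n_classes=3)
for (low, high), f in table:
    print(f"{low}-{high}: {f}")
```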

Graphical Representation of Data

**Bar graphs**

Graphical representation of data using bars of equal width and equal spacing between them (on one axis). The height of each bar represents the frequency of the corresponding category.

| Savings (in percentage) | Number of Employees (Frequency) |
|---|---|
| 20 | 105 |
| 30 | 199 |
| 40 | 29 |
| 50 | 73 |
| Total | 400 |

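The data can be represented as a bar graph, with the savings percentage along one axis and the number of employees along the other.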
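As a rough illustration, the savings table can be drawn as a text bar chart. The sketch below draws the bars horizontally, so the length of a bar plays the role of the height. The function name `text_bar_chart` and the scale of 10 employees per `#` are choices made for this example.

```python
# Savings (%) and number of employees, copied from the table above.
savings = {20: 105, 30: 199, 40: 29, 50: 73}


def text_bar_chart(data, scale=10):
    """Return one line per category; each '#' stands for `scale` units.

    Partial units are dropped, so the bars are approximate.
    """
    lines = []
    for category, frequency in data.items():
        bar = "#" * (frequency // scale)
        lines.append(f"{category:>3}% | {bar} {frequency}")
    return lines


for line in text_bar_chart(savings):
    print(line)
```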

**Variable being a number**

- A variable can be a number, such as ‘no. of students’ or ‘no. of months’.
- Such data can be represented by bar graphs or histograms, depending on the type of data.

Discrete → bar graphs

Continuous → Histograms

**Histograms**

- Like bar graphs, but for continuous class intervals, so there are no gaps between the rectangles.
- The area of each rectangle is proportional (∝) to the frequency of the corresponding class, and its width is equal to the class width.
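Because the area of a rectangle, not its height, represents the frequency, the heights need adjusting when the class widths differ. The sketch below takes the constant of proportionality to be 1, so height = frequency / class width. The function name `bar_heights`, the intervals and the frequencies are all hypothetical.

```python
def bar_heights(intervals, frequencies):
    """Height of each histogram rectangle so that area = frequency.

    height = frequency / class width, hence width * height = frequency.
    """
    return [
        f / (upper - lower)
        for (lower, upper), f in zip(intervals, frequencies)
    ]


# Hypothetical class intervals of unequal width and their frequencies.
intervals = [(0, 20), (20, 30), (30, 40), (40, 70)]
frequencies = [8, 10, 12, 6]

heights = bar_heights(intervals, frequencies)
print(heights)  # [0.4, 1.0, 1.2, 0.2]
```

When all class widths are equal, the heights are simply proportional to the frequencies, and the frequencies themselves can be used as heights.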

**Frequency polygon**

- If the midpoints of the upper sides of the rectangles in a histogram are joined by line segments, the figure formed is a frequency polygon.
- It can also be drawn without a histogram, using the midpoints (class marks) of the class intervals.

**Midpoint of class interval**

The midpoint of a class interval is called the class mark.

Class mark = (Upper limit + Lower limit)/2
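The class mark formula can be applied directly to a list of class intervals. A minimal sketch follows; the function name `class_mark` and the intervals are made up for the example.

```python
def class_mark(lower, upper):
    """Midpoint of a class interval: (upper limit + lower limit) / 2."""
    return (upper + lower) / 2


# Hypothetical class intervals.
intervals = [(0, 10), (10, 20), (20, 30)]

class_marks = [class_mark(low, high) for low, high in intervals]
print(class_marks)  # [5.0, 15.0, 25.0]
```

These class marks are the x-coordinates of the points of a frequency polygon.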

**Equality of areas**

– Adding two class intervals with zero frequency, one preceding the lowest class interval and one succeeding the highest, makes the area of the frequency polygon equal to that of the histogram (this can be shown using congruent triangles).
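The equality of areas can be checked numerically. The sketch below assumes equal class widths and rectangle heights equal to the frequencies. It pads the frequencies with a zero at each end and sums the trapeziums between consecutive class marks. The function names and the frequencies are hypothetical.

```python
def histogram_area(width, frequencies):
    """Total area of the rectangles: width * frequency, summed over classes."""
    return width * sum(frequencies)


def polygon_area(width, frequencies):
    """Area under the frequency polygon, using trapeziums.

    One zero-frequency class is added at each end. Consecutive class
    marks are `width` apart because the class widths are equal.
    """
    padded = [0] + list(frequencies) + [0]
    return sum(width * (a + b) / 2 for a, b in zip(padded, padded[1:]))


# Hypothetical frequencies for four classes of width 10.
freqs = [4, 9, 6, 1]

print(histogram_area(10, freqs))  # 200
print(polygon_area(10, freqs))    # 200.0
```

Without the two zero-frequency classes, the polygon would stop at the first and last class marks and its area would fall short of the histogram's.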

Measures of Central Tendency

**Average**

– The average of a number of observations is the sum of the values of all the observations divided by the total number of observations.
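The definition translates directly into code. A minimal sketch, with a hypothetical function name `average` and made-up observations:

```python
def average(observations):
    """Sum of all observations divided by the number of observations."""
    return sum(observations) / len(observations)


# Hypothetical observations.
values = [4, 8, 15, 16, 23, 42]

print(average(values))  # 18.0
```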

**Mean**

**Mode**

- The most frequently occurring observation is called the mode.
- For grouped data, the class interval with the highest frequency is called the modal class.
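The sketch below finds the mode of raw data and the modal class of grouped data. It returns every value that shares the highest frequency, because data can have more than one mode. The function name `modes` and all the data are hypothetical.

```python
from collections import Counter


def modes(observations):
    """Return all values that occur with the highest frequency, in ascending order."""
    counts = Counter(observations)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


print(modes([2, 3, 3, 5, 7, 3, 5]))  # [3]
print(modes([1, 1, 2, 2, 3]))        # [1, 2]

# Modal class: the class interval with the highest frequency.
class_frequencies = {(0, 10): 4, (10, 20): 9, (20, 30): 6}
modal_class = max(class_frequencies, key=class_frequencies.get)
print(modal_class)  # (10, 20)
```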

**Median**

- The median is the value of the middlemost observation when the data are arranged in ascending or descending order.
- If n (the number of observations) is odd, Median = value of the [(n+1)/2]th observation.
- If n is even, the Median is the mean of the (n/2)th and [(n/2)+1]th observations.
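A minimal sketch of the median calculation follows. It sorts the data first, then takes the [(n+1)/2]th observation when n is odd, or the mean of the (n/2)th and [(n/2)+1]th observations when n is even. The function name `median` and the data are made up for the example. Positions in the rule count from 1, while Python lists count from 0.

```python
def median(observations):
    """Median of a non-empty list of numbers."""
    ordered = sorted(observations)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        # Odd n: the ((n + 1) / 2)-th observation is index `middle`.
        return ordered[middle]
    # Even n: mean of the (n / 2)-th and (n / 2 + 1)-th observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


print(median([7, 1, 5]))     # 5
print(median([7, 1, 5, 3]))  # 4.0
```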