Thursday, August 7, 2008

Triangular Plot


A triangular plot (also called a ternary plot) is a graph of a system of three variables. It is most often used in geologic studies to show the relative compositions of soils and rocks, but it applies to any system of three variables whose proportions always sum to some constant. This particular triangular plot shows a general election in the U.K. in 2005. It was prepared before the election to estimate who would win.
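Because the three proportions sum to a constant, each observation can be placed as a single point inside an equilateral triangle. The sketch below shows one common way to compute that point. The function name `ternary_to_xy`, the corner layout, and the sample shares are assumptions made for illustration; they are not taken from the election plot.

```python
import math

def ternary_to_xy(a, b, c):
    """Map three non-negative parts to a point in a unit-side equilateral triangle."""
    total = a + b + c
    if total <= 0:
        raise ValueError("the three parts must sum to a positive constant")
    # Normalize so the shares sum to 1, whatever constant the raw values sum to.
    b_share = b / total
    c_share = c / total
    # Assumed corner layout: a -> (0, 0), b -> (1, 0), c -> (0.5, sqrt(3)/2).
    x = b_share + 0.5 * c_share
    y = (math.sqrt(3) / 2) * c_share
    return x, y

# Hypothetical vote shares in percent for three parties (not the 2005 figures).
print(ternary_to_xy(40, 35, 25))
```

A point at a corner means one variable accounts for the whole total, and a point at the center means the three shares are equal.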

Histogram


The histogram is a graphical way to display the frequencies of scores. The data are collected and then divided into bins of equal width. The number of scores that fall into each bin is counted, and the frequencies are then graphed. Many measured quantities produce a roughly bell-shaped histogram when the sample is large enough, but not all do; if a variable that is expected to be normally distributed gives a histogram that does not look like a bell curve, there may be some significant reason behind it.
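The binning step can be sketched in a few lines. This is a minimal illustration, assuming equal-width bins that span the range from the smallest to the largest score; the function name `histogram_counts` and the sample scores are hypothetical.

```python
def histogram_counts(scores, num_bins):
    """Count how many scores fall into each of num_bins equal-width bins."""
    lo, hi = min(scores), max(scores)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for score in scores:
        if width == 0:
            index = 0  # every score is identical, so use the first bin
        else:
            # The largest score lands exactly on the upper edge; keep it in the last bin.
            index = min(int((score - lo) / width), num_bins - 1)
        counts[index] += 1
    return counts

# Hypothetical scores, split into three bins: [1, 4), [4, 7), [7, 10].
print(histogram_counts([1, 2, 2, 3, 5, 8, 9, 10], 3))
```

Plotting the returned counts as bars, one per bin, gives the histogram.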

Box Plot


The box plot is a statistical tool for graphically condensing an entire data set into a five-number summary: the minimum value, the first quartile, the median, the third quartile, and the maximum value. More specifically, the first and third quartiles are the medians of the lower and upper halves of the sorted data, so about a quarter of the values lie below the first quartile and about a quarter lie above the third. The box plot is a very simple way to see whether a data set is skewed one way or the other, just by comparing the distance between the median and the first quartile with the distance between the median and the third quartile. Another major advantage of the box plot is that it separates the bulk of the data from the outliers, which many versions draw as individual points, so a few extreme values do not distort the picture of where most of the data lie.
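The five-number summary can be computed as sketched below. Several quartile conventions exist; this sketch assumes the "median of each half" convention, leaving the middle value out of both halves when the count is odd. The function name `five_number_summary` and the sample values are illustrative only.

```python
from statistics import median

def five_number_summary(data):
    """Return (minimum, first quartile, median, third quartile, maximum)."""
    values = sorted(data)
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    lower = values[:half]
    # With an odd count, the middle value belongs to neither half.
    upper = values[half:] if n % 2 == 0 else values[half + 1:]
    return (values[0], median(lower), median(values), median(upper), values[-1])

# Illustrative values only.
low, q1, med, q3, high = five_number_summary([1, 8, 9, 32, 34, 37, 45, 51, 55, 81])
print(low, q1, med, q3, high)
# Comparing the two half-box widths gives a quick check for skew.
print(med - q1, q3 - med)
```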

Stem and Leaf Plots


Stem and leaf plots are an efficient way to display data. Imagine 10 people sitting around a table at a family reunion. Their ages are 32, 1, 45, 37, 8, 9, 55, 81, 34, and 51. You might make a stem and leaf plot using the tens digits from their ages as the "stems" and the ones digits as the "leaves." The plot would look like the picture that is found here.
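The same plot can be built in text form from the ten ages. The sketch below assumes non-negative whole-number values and lists stems that have no leaves, which is a common convention; the function name `stem_and_leaf` is made up for this example.

```python
def stem_and_leaf(values):
    """Group values by tens digit (stem), keeping sorted ones digits (leaves)."""
    stems = {}
    for value in sorted(values):
        stems.setdefault(value // 10, []).append(value % 10)
    # Include stems with no leaves so gaps in the data stay visible.
    return {stem: stems.get(stem, []) for stem in range(min(stems), max(stems) + 1)}

ages = [32, 1, 45, 37, 8, 9, 55, 81, 34, 51]
for stem, leaves in stem_and_leaf(ages).items():
    print(f"{stem} | {' '.join(str(leaf) for leaf in leaves)}")
```

Each printed row is one stem followed by its leaves, so the row for stem 3 holds the ages 32, 34, and 37.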

Similarity Matrix


A similarity matrix is a matrix of scores that express the similarity between pairs of data points. Similarity matrices are used, for example, in sequence alignment, where higher scores are given to more-similar characters and lower or negative scores to dissimilar characters. This similarity matrix compares a recording of a TV stream with itself. Programs are easy to locate because they correspond to blocks of high coefficients along the main diagonal.
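To make the idea of comparing a recording with itself concrete, here is a minimal sketch that scores every pair of frames in a short sequence. It assumes each frame is already summarized as a non-zero feature vector and uses cosine similarity as the score; both choices, the function names, and the four sample frames are assumptions for illustration, not the method behind the TV stream figure.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors (1.0 means same direction)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def self_similarity_matrix(frames):
    """Entry [i][j] is the similarity between frame i and frame j."""
    return [[cosine_similarity(u, v) for v in frames] for u in frames]

# Hypothetical feature vectors: two frames from one program, then two from another.
frames = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
for row in self_similarity_matrix(frames):
    print(" ".join(f"{score:.2f}" for score in row))
```

In the printed matrix, the high scores cluster in two square blocks along the main diagonal, one per program.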

Correlation Matrix


The picture here is an example of a correlation matrix. It is a correlation matrix of 20 climate model biases at certain spatial locations. Each row and column of the matrix represents one of the 20 models in the study; consequently, each entry on the main diagonal is the correlation of a model with itself and is always unity. Everything off the main diagonal (how one model is correlated with another) is the focus of this study.
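A correlation matrix like this can be sketched with the Pearson correlation coefficient. The example below assumes each model's biases are given as a list of numbers, one per spatial location, and that no list is constant; the function names and the three short series are hypothetical and are not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def correlation_matrix(series):
    """Entry [i][j] is the correlation between series i and series j."""
    return [[pearson(x, y) for y in series] for x in series]

# Hypothetical biases for three models at four locations.
biases = [[1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0], [4.0, 3.0, 2.0, 1.0]]
for row in correlation_matrix(biases):
    print(" ".join(f"{value:+.2f}" for value in row))
```

The diagonal entries come out as 1 because each series is compared with itself, and the matrix is symmetric because the correlation of model i with model j equals that of j with i.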

Star Plots


Star plots are a useful way to display multivariate observations with an arbitrary number of variables. Each observation is represented as a star-shaped figure with one ray for each variable. For a given observation, the length of each ray is made proportional to the size of that variable. This particular star plot is of automobile data. Each star represents one car model; each ray in the star is proportional to one variable.
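The geometry of a single star can be sketched as follows. This assumes each variable is scaled by the largest value it takes across all observations, so that ray lengths fall between 0 and 1, and that rays are spaced evenly around the center. The function name `star_rays` and the car figures are invented for the example.

```python
import math

def star_rays(observation, maxima):
    """Return the (x, y) tip of each ray for one observation, centered at the origin."""
    n = len(observation)
    tips = []
    for k, (value, largest) in enumerate(zip(observation, maxima)):
        length = value / largest        # ray length proportional to the variable
        angle = 2 * math.pi * k / n     # rays evenly spaced around the center
        tips.append((length * math.cos(angle), length * math.sin(angle)))
    return tips

# Hypothetical car: four variables, each with an assumed maximum across all cars.
car = [30, 120, 2500, 4]
maxima = [40, 240, 5000, 8]
for x, y in star_rays(car, maxima):
    print(f"({x:+.2f}, {y:+.2f})")
```

Joining the ray tips in order draws the outline of the star for that observation.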