# x-y scatter plot with correlated random numbers

In this module:

- Random number generator - using volatile functions - with number of observations \(n\) switch
- Dynamic range as a statistical data reference
- Dynamic range as a chart series
- Creating an X Y scatter chart

## Random number generator

Pseudo random numbers \(\text{N(0,1) i.i.d.}\) are generated by the Excel **RAND **function and then passed as the probability argument to the **NORM.S.INV** function. The correlation structure is incorporated by using the Cholesky decomposition method.

### Cholesky decomposition

Generate \(Z_1,Z_2\)

- Generate \(Z_1,Z_2 \sim \text{N(0,1) i.i.d.}\)
- Set \(X=Z_1\) and \(Y=\rho Z_1 + Z_2 \sqrt{1-\rho^2}\), where \(\rho\) is the correlation coefficient

### The X Y data worksheet

\(Z_1,Z_2\) are generated on the worksheet named: `X Y data`

. Pseudo random numbers \(\text{N(0,1) i.i.d.}\) are generated with the nested `NORM.S.INV(RAND())`

formula.

- Cell
**B3:**`=IF(ROW()-2<=NoObs,NORM.S.INV(RAND()),"")`

- Cell
**C3:**`=IF(ROW()-2<=NoObs,rho*B3+SQRT(1-rho^2)*NORM.S.INV(RAND()),"")`

where`rho`

and`NoObs`

are links to the selector and validator panel - Copy
`B3:C3`

and paste to the range`B4:B5002`

. Name`B3:B5002`

as X, and`C3:C5002`

as Y

## Summary statistics and the X-Y scatter plot

Refer to the X Y Interface worksheet shown in the figure 1 Excel Web App #1.

### The selector / validator panel

**The correlation selector**

**D5**: the correlation selector - is set up by Data Validation- Select
- In the
**Settings**tab
dialog box - - Set Validation criteria; Allow:
`Decimal`

; Data:`between`

; Minimum:`-1`

; Maximum`1`

- In the Data Validation dialog box -
**Input Message**tab - Set Title:
`Correlation value`

; Input message:`Enter rho in the range ... -1 <= rho <= +1`

**The "No of points" validator**

**D6**: the "No of point"s validator - is set up by Data Validation- Select
- In the
**Setting**tab
dialog box - - Set Validation Criteria; Allow:
`List`

; Source:`625,1250,2500,5000`

- In the Data Validation dialog box -
**Input Message**tab - Set Title:
`Number of observations`

; Input message:`625, 1250, 2500, 5000, ... Select 625 to reduce worksheet recalculation time`

### The descriptive statistics panel

Note: INDIRECT(D$10) uses the value in cell D10, X, to point to the range named X.

Formulas in the summary statistics section:

**C11:**`=MIN(INDIRECT(D$10)`

**C12:**`=MAX(INDIRECT(D$10)`

**C13:**`=AVERAGE(INDIRECT(D$10)`

**C14:**`=VAR.S(INDIRECT(D$10)`

**C15:**`=STDEV.S(INDIRECT(D$10)`

**C16:**`=SKEW(INDIRECT(D$10)`

**C17:**`=KURT(INDIRECT(D$10)`

**C18:**`=COUNT(INDIRECT(D$10)`

**C19:**`=MAX(INDIRECT(D$10)-MIN(INDIRECT(D$10)`

**C21:**`=COVARIANCE.S(X,Y)`

**C22:**`=CORREL(X,Y)`

**C23:**`=COVARIANCE.S(X,Y)/(STDEV.S(X)*STDEV.S(Y))`

### The X Y scatter plot

- To select the data, type
`XV,YV`

in the**Name Box**. If you are using the Web App file, you will need to**Unhide**the`X Y data`

worksheet before you select the data. Create the chart on that worksheet, then move it to the`X Y interface`

worksheet **Insert the chart**- follow the ribbon sequence: . Click**OK**to display the chart shown in figure 2- Using the sequence does not retain the reference to the dynamic range names, instead a static name is applied (see figure 4). When a vector length shorter than the static range is selected, the X Y chart collapses as shown in figure 3. The x axis (the Horizontal (Value) Axis) displays the observation numbers, with the observation 626 onwards displaying as blank (the static vector being 1,250 data points in length)
- To display the
**Edit Series**dialog box, use the ribbon sequence to display the**Select Data Source**dialog box, then, in**Legend Entries (Series)**select`Series1`

, then click**Edit**. The**Edit Series**dialog box will now be displayed - figure 4 - In step 1, the dynamic ranges
`XY`

and`YV`

were selected, but these were overwritten by Series X values:`='X Y data!$B$3:$B$1252`

and Series Y values:`='X Y data!$C$3:$C$1252`

as shown in the Edit Series dialog box in figure 4. The series reference need to be replaced with links to the dynamic vectors **Link the chart series to the dynamic range**- by editing the series values. Set Series X values:`='xlf-x-y-scatter.xlsx'!XV`

, and Series Y values:`='xlf-x-y-scatter.xlsx'!YV`

**Format the X Y chart**- from the tab, select then the element.- From the same ribbon tab, select
`Height: 12.5 cm`

, and`Width: 12.5 cm`

, and set - A two vector X Y plot has only one series, and the
**Series name**value in figure 4 was left blank, thus the default name is**Series1**. Select**Series1**from the - Click to display the Format Data Series task pane. This is the Office 2013 implementation of an Office 2010 dialog box. An example is shown in figure 5
- Select
**Built-in**, and set Size:`2`

, then - Select
**Format Axis**task pane
. This will display the - Select
`-4.0`

, and Maximum:`4.0`

. Vertical axis crosses; Axis value:`0.0`

(the three column icon), then set Bounds; Minimum: - Repeat step 10 for the
**Vertical (value) Axis**. Select the axis from the Axis Options > Elements List box - figure 6 **Add a trendline**- to display the Format Trendline task pane.- Select the
**Fill & Line**tab, then set Line:`Solid`

; Color:`Standard Colors, Red`

; Width:`1.5 pt`

; Dash type:`Solid`

; then**Close**the task pane. **Hide the chart title**-- Cut and paste the chart to the
`X Y interface`

worksheet - The completed chart appears in figure 7

Selected Excel functions used in this module and associated statistical functions.

Excel functions | Description |
---|---|

NORM.S.INV(probability) [2010] | Returns the inverse of the standard normal cumulative distribution \(\text{N(0,1) i.i.d.}\). |

RAND() | Returns an evenly distributed random number \(0 \le x \lt 1\). The rand function takes no arguments |

ROW([reference]) | Returns the row number of a reference |

SQRT(number) | Returns the positive square root of a number \(\sqrt{x}; \quad x^{0.5} \) |

- This example was developed in Excel 2013 Pro 64 bit.

**Published:**2 September 2015**Revised:**Thursday 13th of February 2020 - 11:48 AM, [Australian Eastern Standard Time (EST)]