
toto.plugins.statistics.common_statistics

class toto.plugins.statistics.common_statistics.Statistics(pandas_obj)[source]

Bases: object

Directional_statistics(mag='mag', drr='drr', args={'Percentile or Quantile': 0.1, 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'function': {'Max': True, 'Mean': False, 'Median': False, 'Min': False, 'Percentile': False, 'Prod': False, 'Quantile': False, 'Std': False, 'Sum': False, 'Var': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

Extract statistics for the selected directional bins.

Parameters

mag: str

Name of the column from which to get stats.

drr: str

Column name representing the directions.

args: dict
Dictionary with the following keys:
function: str

Name of the function to use. Can be Max, Mean, Median, Min, Percentile, Prod, Quantile, Std, Sum or Var.

Percentile or Quantile: float

Percentile or Quantile value depending on the function

direction binning: str

Can be centered or not-centered, depending on whether the directional bins are centered on 0

direction interval: int

Directional interval for the bins, in degrees

folder out: str

Path to save the output

Time blocking: str
if Time blocking=='Yearly',

Statistics will be calculated for the whole time series

if Time blocking=='South hemisphere(Summer/Winter)',

Statistics will be calculated for South hemisphere summer and winter seasons

if Time blocking=='South hemisphere 4 seasons',

Statistics will be calculated for each South hemisphere season

if Time blocking=='North hemishere(Summer/Winter)',

Statistics will be calculated for North hemisphere summer and winter seasons

if Time blocking=='North hemisphere 4 seasons',

Statistics will be calculated for each North hemisphere season

if Time blocking=='North hemisphere moosoon(SW,NE,Hot season)',

Statistics will be calculated for the North hemisphere monsoon seasons

Examples:

>>> df=tf['test1']['dataframe'].Statistics.Directional_statistics(mag='U',drr='drr',args={'direction interval':45,'Time blocking':'Yearly'})

Outputs:

Directional statistics example (table; cell values are not shown in the source)

Statistic: MEAN
Columns: N, S, E, W, Total
Rows: January, February, Annual
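The effect of the direction binning option can be pictured with a small sketch. It is not toto's implementation: `direction_bin` and `directional_max` are hypothetical helpers, and the sketch assumes that a centered bin straddles its nominal direction (for a 45 degree interval, the north bin spans 337.5 to 22.5 degrees) while a not-centered bin starts at it (0 to 45 degrees).

```python
from collections import defaultdict


def direction_bin(direction, interval=45.0, centered=True):
    """Return the index of the directional bin containing `direction` (degrees)."""
    # Assumption: a centered bin straddles its nominal direction,
    # so the direction is shifted by half a bin before binning.
    offset = interval / 2.0 if centered else 0.0
    return int(((direction + offset) % 360.0) // interval)


def directional_max(magnitudes, directions, interval=45.0, centered=True):
    """Maximum magnitude per directional bin, keyed by the bin's nominal direction."""
    groups = defaultdict(list)
    for mag, drr in zip(magnitudes, directions):
        groups[direction_bin(drr, interval, centered)].append(mag)
    return {idx * interval: max(vals) for idx, vals in sorted(groups.items())}


mags = [1.0, 2.5, 0.7, 3.1]
dirs = [350.0, 10.0, 95.0, 100.0]
print(directional_max(mags, dirs))                  # {0.0: 2.5, 90.0: 3.1}
print(directional_max(mags, dirs, centered=False))  # {0.0: 2.5, 90.0: 3.1, 315.0: 1.0}
```

With centered bins, 350 and 10 degrees both fall in the north bin; with not-centered bins, 350 degrees moves to the bin that starts at 315 degrees.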

common_statistics(mag=['mag'], drr='drr', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'minimum occurrence (main direction) [%]': 15, 'stats': 'n min max mean std [1,5,10,50,90,95,99]', 'time blocking': {'north hemishere(Summer/Winter)': False, 'north hemisphere 4 seasons': False, 'north hemisphere moosoon(SW,NE,Hot season)': False, 'south hemisphere 4 seasons': False, 'south hemisphere(Summer/Winter)': False, 'yearly': True}})[source]

Extract statistics from a pandas DataFrame column.

Parameters

mag: str

Name of the column from which to get stats. Can be a list, to extract stats from multiple columns.

drr: str, optional

Column name representing the directions.

args: dict

Dictionary with the following keys:

minimum occurrence (main direction) [%]: int

Used to calculate the main direction. A direction counts as a main direction when its occurrence is >= the minimum occurrence. Default is 15.

folder out: str

Path to save the output

time blocking: str
if time blocking=='yearly',

Statistics will be calculated for the whole time series

if time blocking=='south hemisphere(Summer/Winter)',

Statistics will be calculated for South hemisphere summer and winter seasons

if time blocking=='south hemisphere 4 seasons',

Statistics will be calculated for each South hemisphere season

if time blocking=='north hemishere(Summer/Winter)',

Statistics will be calculated for North hemisphere summer and winter seasons

if time blocking=='north hemisphere 4 seasons',

Statistics will be calculated for each North hemisphere season

if time blocking=='north hemisphere moosoon(SW,NE,Hot season)',

Statistics will be calculated for the North hemisphere monsoon seasons

stats: str

String containing the names of the stats to compute (each must be a numpy function). Example: n min max mean std [1,5,10,50,90,95,99], where:

  • n is the number of samples

  • exceedence values go inside []

Examples:

>>> df=tf['test1']['dataframe'].Statistics.common_statistics(mag='U',drr='drr',args={'time blocking':'yearly'})

Outputs:

Common statistics example (table; cell values are not shown in the source)

Columns: N, min, max, mean, std, P1, P90, Main Direction
Rows: June, July, Winter, Total
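The layout of the stats string can be illustrated with a short parsing sketch. `parse_stats` is a hypothetical helper written for this page, not toto's own parser; it only shows how the statistic names and the bracketed levels can be separated.

```python
import re


def parse_stats(spec):
    """Split a stats string into statistic names and bracketed levels."""
    levels = []
    match = re.search(r"\[([^\]]*)\]", spec)
    if match:
        # Values inside [] are read as numeric levels.
        levels = [float(v) for v in match.group(1).split(",") if v.strip()]
        # Remove the bracketed part so that only the names remain.
        spec = spec[:match.start()] + spec[match.end():]
    return spec.split(), levels


names, levels = parse_stats("n min max mean std [1,5,10,50,90,95,99]")
print(names)   # ['n', 'min', 'max', 'mean', 'std']
print(levels)  # [1.0, 5.0, 10.0, 50.0, 90.0, 95.0, 99.0]
```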

comparison_statistics(measured='measured', hindcast='hindcast', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs'})[source]

Extract comparison statistics such as BIAS, MAE, RMSE and MRAE.

Parameters

measured: str

Name of the column representing the measured data.

hindcast: str

Name of the column representing the hindcast data.

args: dict
Dictionary with the following keys:
folder out: str

Path to save the output

Examples:

>>> df=tf['test1']['dataframe'].Statistics.comparison_statistics(measured='U',hindcast='u',args={'folder out':'/tmp'})

Outputs:

Comparison statistics example

MAE: Mean Absolute Error
RMSE: Root Mean Square Error
MRAE: Mean Relative Absolute Error
BIAS: Bias
SI: Scatter Index
IOA: Index of Agreement
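For readers unfamiliar with these metrics, the sketch below computes them from two plain Python lists using common textbook definitions. It is an illustration only: `comparison_metrics` is a hypothetical helper, the sign convention (hindcast minus measured), the scatter index normalization (RMSE divided by the measured mean) and the Willmott form of the index of agreement are assumptions, and toto's own definitions may differ.

```python
import math


def comparison_metrics(measured, hindcast):
    """Common skill metrics between a measured and a hindcast series."""
    n = len(measured)
    diffs = [h - m for m, h in zip(measured, hindcast)]  # hindcast minus measured
    mean_m = sum(measured) / n
    sq_err = sum(d * d for d in diffs)
    rmse = math.sqrt(sq_err / n)
    # Willmott's index of agreement: 1 minus squared error over potential error.
    potential = sum(
        (abs(h - mean_m) + abs(m - mean_m)) ** 2 for m, h in zip(measured, hindcast)
    )
    return {
        "BIAS": sum(diffs) / n,
        "MAE": sum(abs(d) for d in diffs) / n,
        "RMSE": rmse,
        # MRAE is undefined where a measured value is zero.
        "MRAE": sum(abs(d) / abs(m) for d, m in zip(diffs, measured)) / n,
        "SI": rmse / mean_m,
        "IOA": 1.0 - sq_err / potential,
    }


metrics = comparison_metrics([1.0, 2.0, 4.0], [1.5, 2.0, 3.0])
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```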

excedence_coincidence_probability(data='data', coincident_nodir='coincident_nodir', coincident_with_dir='coincident_with_dir', args={'Coincidence bins: Min Res Max(optional)': [0, 2], 'Duration Min Res Max': [6, 6, 72], 'Exceedance bins: Min Res Max(optional)': [0, 2], 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'exceedence': True, 'non-exceedence': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

Exceedence and non-exceedence analysis coincident with another parameter. Similar to the joint probability function, but includes a cumulative sum to obtain exceedence or non-exceedence (in %).

Parameters

data: str

Name of the column from which to get stats.

coincident_with_dir: str

Column name representing the directions.

coincident_nodir: str

Column name representing another magnitude.

args: dict
Dictionary with the following keys:
method: str

Name of the method to use. Can be exceedence or non-exceedence

direction binning: str

Can be centered or not-centered, depending on whether the directional bins are centered on 0

direction interval: int

Directional interval for the bins, in degrees

folder out: str

Path to save the output

Probablity expressed in: str

This can be percent or per thoushand

Exceedance bins: Min Res Max(optional): list

Minimum, resolution and maximum values of the X axis used in the joint probability

Coincidence bins: Min Res Max(optional): list

Minimum, resolution and maximum values of the Y axis used in the joint probability

Time blocking: str
if Time blocking=='Annual',

Statistics will be calculated for the whole time series

if Time blocking=='Seasonal (South hemisphere)',

Statistics will be calculated for each South hemisphere season

if Time blocking=='Seasonal (North hemisphere)',

Statistics will be calculated for each North hemisphere season

if Time blocking=='Monthly',

Statistics will be calculated for each month

Examples:

>>> df=tf['test1']['dataframe'].Statistics.excedence_coincidence_probability(data='U',coincident_with_dir='drr',args={'direction interval':45,'Time blocking':'Annual'})

Outputs:

Excedence coincidence probability (table; cell values are not shown in the source)

Quantity: exceedence %
Columns: 0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8, Total
Rows: >0.0, >0.2, >0.4
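The cumulative-sum step mentioned in the description can be sketched as follows. The helpers `to_exceedance` and `to_non_exceedance` are hypothetical and operate on one column of a joint-probability table (percentages per data bin, lowest bin first); how toto arranges its tables internally is not shown in the source.

```python
from itertools import accumulate


def to_exceedance(bin_percentages):
    """Percentage of samples at or above the lower edge of each bin."""
    # Sum from the highest bin downwards, then restore the original order.
    return list(accumulate(reversed(bin_percentages)))[::-1]


def to_non_exceedance(bin_percentages):
    """Percentage of samples below the upper edge of each bin."""
    return list(accumulate(bin_percentages))


column = [40.0, 30.0, 20.0, 10.0]  # bins 0.0-0.2, 0.2-0.4, 0.4-0.6, 0.6-0.8
print(to_exceedance(column))      # [100.0, 60.0, 30.0, 10.0]
print(to_non_exceedance(column))  # [40.0, 70.0, 90.0, 100.0]
```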

exceedence_probability(data='data', args={'duration Min Res Max': [6, 6, 72], 'exceedance bins: Min Res Max(optional)': [2, 1, 22], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'exceedence': False, 'non-exceedence': False, 'persistence exceedence': True, 'persistence non-exceedence': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

This function calculates the frequency of occurrence of data:

  • exceeding specific values (exceedence)

  • not exceeding specific values (non-exceedence)

  • exceeding specific values during a specific duration (persistence exceedence)

  • not exceeding specific values during a specific duration (persistence non-exceedence)

Parameters

data: str

Name of the column from which to get stats.

args: dict
Dictionary with the following keys:
method: str

It can be exceedence, non-exceedence, persistence exceedence or persistence non-exceedence

exceedance bins: Min Res Max(optional): list

Minimum, resolution and maximum value of X axis to use

duration Min Res Max: list

Minimum, resolution and maximum duration to use in hours

folder out: str

Path to save the output

time blocking: str
if Time blocking=='Annual',

Statistics will be calculated for the whole time series

if Time blocking=='Seasonal (South hemisphere)',

Statistics will be calculated for each South hemisphere season

if Time blocking=='Seasonal (North hemisphere)',

Statistics will be calculated for each North hemisphere season

if Time blocking=='Monthly',

Statistics will be calculated for each month

Examples:

>>> df=tf['test1']['dataframe'].Statistics.exceedence_probability(data='U',args={'time blocking':'Monthly'})

Outputs:

Weather_window example (table; cell values are not shown in the source)

Columns (durations): 6, 12, 18, 24, 36
Rows (thresholds): >0.2, >0.4, >0.6
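The plain exceedence and non-exceedence methods can be sketched in a few lines. `make_bins` and `exceedance_percent` are hypothetical helpers, not toto functions; the sketch assumes all three values of Min Res Max are given (the source marks the maximum as optional) and leaves out the persistence variants.

```python
def make_bins(min_res_max):
    """Expand [min, res, max] into a list of thresholds, maximum included."""
    lo, res, hi = min_res_max
    count = int(round((hi - lo) / res)) + 1
    return [lo + i * res for i in range(count)]


def exceedance_percent(values, thresholds, non_exceedance=False):
    """Percentage of samples above (or not above) each threshold."""
    total = len(values)
    table = {}
    for thr in thresholds:
        hits = sum(1 for v in values if v > thr)
        if non_exceedance:
            hits = total - hits
        table[thr] = 100.0 * hits / total
    return table


speeds = [1, 3, 5, 7, 9]
thresholds = make_bins([2, 2, 6])
print(thresholds)                                                    # [2, 4, 6]
print(exceedance_percent(speeds, thresholds))                        # {2: 80.0, 4: 60.0, 6: 40.0}
print(exceedance_percent(speeds, thresholds, non_exceedance=True))   # {2: 20.0, 4: 40.0, 6: 60.0}
```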

joint_probability(mag='speed', drr='direction', period='period', args={'X Min Res Max(optional)': [2, 1, 22], 'Y Min Res Max(optional)': [0, 0.5], 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'Mag vs Dir': True, 'Mag vs Per': False, 'Per Vs Dir': False}, 'probablity expressed in': {'per thoushand': True, 'percent': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

This function provides joint distribution tables for X and Y, i.e. the probability of events defined in terms of both X and Y (per 1000). It can be applied to magnitude-direction, magnitude-period or period-direction.

Parameters

mag: str

Name of the column from which to get stats.

drr: str

Column name representing the directions. Used if method is Per Vs Dir or Mag vs Dir.

period: str

Column name representing the period. Used if method is Per Vs Dir or Mag vs Per.

args: dict
Dictionary with the following keys:
method: str

Name of the method to use. Can be:

  • Mag vs Dir: plot magnitude versus direction

  • Per Vs Dir: plot period versus direction

  • Mag vs Per: plot magnitude versus period

direction binning: str

Can be centered or not-centered, depending on whether the directional bins are centered on 0

direction interval: int

Directional interval for the bins, in degrees

folder out: str

Path to save the output

probablity expressed in: str

This can be percent or per thoushand

X Min Res Max(optional): list

Minimum, resolution and maximum values of the X axis used in the joint probability

Y Min Res Max(optional): list

Minimum, resolution and maximum values of the Y axis used in the joint probability

Time blocking: str
if Time blocking=='Annual',

Statistics will be calculated for the whole time series

if Time blocking=='Seasonal (South hemisphere)',

Statistics will be calculated for each South hemisphere season

if Time blocking=='Seasonal (North hemisphere)',

Statistics will be calculated for each North hemisphere season

if Time blocking=='Monthly',

Statistics will be calculated for each month

Examples:

>>> df=tf['test1']['dataframe'].Statistics.joint_probability(mag='U',drr='drr',args={'direction interval':45,'Time blocking':'Annual'})

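Outputs:

Joint probability example (table; most cell values are not shown in the source)

Block: January
Columns: 0, 1, 2, 3, Total
Rows: 0, 1, 2, Total
The only value shown is 100, in the Total row.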
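A joint occurrence table of this kind can be sketched with plain Python. `find_bin` and `joint_probability_table` are hypothetical helpers; the sketch assumes half-open bins, ignores samples that fall outside the bin edges, and normalizes by the number of samples inside the grid, which may not match toto's handling.

```python
from bisect import bisect_right


def find_bin(value, edges):
    """Index of the half-open bin [edges[k], edges[k+1]) holding value, or None."""
    if value < edges[0] or value >= edges[-1]:
        return None
    return bisect_right(edges, value) - 1


def joint_probability_table(x, y, x_edges, y_edges, per_thousand=True):
    """Rows follow the y bins, columns follow the x bins; cells sum to 1000 or 100."""
    scale = 1000.0 if per_thousand else 100.0
    counts = [[0] * (len(x_edges) - 1) for _ in range(len(y_edges) - 1)]
    inside = 0
    for xv, yv in zip(x, y):
        i, j = find_bin(xv, x_edges), find_bin(yv, y_edges)
        if i is not None and j is not None:
            counts[j][i] += 1
            inside += 1
    if inside == 0:
        return [[0.0] * len(row) for row in counts]
    return [[scale * c / inside for c in row] for row in counts]


speed = [0.5, 0.5, 1.5, 1.5]
direction = [10.0, 100.0, 10.0, 10.0]
table = joint_probability_table(speed, direction, [0, 1, 2], [0, 90, 180])
print(table)  # [[250.0, 500.0], [250.0, 0.0]]
```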

modal_wave_period(Hs='Hs', Tp='Tp', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'time blocking': {'North hemishere(Summer/Winter)': False, 'North hemisphere 4 seasons': False, 'North hemisphere moosoon(SW,NE,Hot season)': False, 'South hemisphere 4 seasons': False, 'South hemisphere(Summer/Winter)': True}})[source]

This function computes the modal period for a set of Hs/Tp values. The modal period is taken as the mean period of the top 5% of wave heights.

Parameters

Hs: str

Name of the column containing significant wave height.

Tp: str

Name of the column containing the wave period.

args: dict
Dictionary with the following keys:
folder out: str

Path to save the output

Time blocking: str
if Time blocking=='South hemisphere(Summer/Winter)',

Statistics will be calculated for South hemisphere summer and winter seasons

if Time blocking=='South hemisphere 4 seasons',

Statistics will be calculated for each South hemisphere season

if Time blocking=='North hemishere(Summer/Winter)',

Statistics will be calculated for North hemisphere summer and winter seasons

if Time blocking=='North hemisphere 4 seasons',

Statistics will be calculated for each North hemisphere season

if Time blocking=='North hemisphere moosoon(SW,NE,Hot season)',

Statistics will be calculated for the North hemisphere monsoon seasons

Examples:

>>> df=tf['test1']['dataframe'].Statistics.modal_wave_period(Hs='hs',Tp='tp')

Outputs:

Modal wave period probability (table; cell values are not shown in the source)

Column: Modal wave period
Rows: January, February
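The definition above (mean period of the top 5% of wave heights) can be sketched as follows. `modal_period` is a hypothetical helper; it assumes "top 5%" means the largest 5% of records ranked by Hs, rounded up to at least one record, which is one possible reading of the description.

```python
import math


def modal_period(hs, tp, top_fraction=0.05):
    """Mean Tp of the records with the largest Hs."""
    # Rank the (Hs, Tp) pairs from highest to lowest wave height.
    pairs = sorted(zip(hs, tp), key=lambda pair: pair[0], reverse=True)
    k = max(1, math.ceil(top_fraction * len(pairs)))
    return sum(period for _, period in pairs[:k]) / k


hs = list(range(1, 31))           # 30 records, Hs from 1 to 30
tp = [h / 2.0 for h in hs]        # illustrative periods only
print(modal_period(hs, tp))       # 14.75 (mean Tp of the two highest records)
```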

wave_population(Hs='Hs', Tm02='Tm02', Drr_optional='Drr_optional', Tp_optional='Tp_optional', SW_optional='SW', args={'Exposure (years) (= length of time series if not specified)': 0, 'Heigh bin size': 0.5, 'Method': {'Height only': True, 'Height/Direction': False, 'Height/Tp': False, 'Height/period': False}, 'Period bin size': 2, 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'directional switch': {'Off': False, 'On': True}, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs'})[source]

This function computes the wave population for fatigue analysis:

  • Based on the Rayleigh distribution if the spectral width parameter (SW) is not specified

  • Based on the Longuet-Higgins Hs-Tp joint probability distribution if SW is specified

Parameters

Hs: str

Name of the column containing significant wave height.

Tm02: str

Name of the column containing the mean wave period computed from spectral moments of order 0 and 2

Drr_optional: str

Optional column containing the direction

Tp_optional: str

Optional column containing the wave period

SW_optional: str

Optional column containing the spectral width parameter

args: dict
Dictionary with the following keys:
Method: str

Name of the method to use. Can be Height only, Height/Direction, Height/Tp or Height/period

direction binning: str

Can be centered or not-centered, depending on whether the directional bins are centered on 0

direction interval: int

Directional interval for the bins, in degrees

Heigh bin size: float

Interval in meters for Hs

Period bin size: float

Interval in seconds for the period

Exposure (years) (= length of time series if not specified): int

Number of years of exposure. Defaults to the length of the time series if not specified

folder out: str

Path to save the output

directional switch: str

Can be On or Off. Set to On to use direction

Examples:

>>> df=tf['test1']['dataframe'].Statistics.wave_population(Hs='hs',Tm02='tm02',args={'Method':'Height only'})

Outputs:

Workability probability (table; cell values are not shown in the source)

Columns: Omni, N, S, E, W
Rows: > 0.0 <= 0.1, > 0.1 <= 0.2, > 0.2 <= 0.3, Total

weather_window(data='data', args={'Duration Min Res Max': [6, 6, 72], 'Exceedance bins: Min Res Max(optional)': [2, 1, 22], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'persistence exceedence': False, 'persistence non-exceedence': True}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

This function calculates the average number of full windows for data:

  • exceeding specific values during a specific duration (persistence exceedence)

  • not exceeding specific values during a specific duration (persistence non-exceedence)

Note: if a window overlaps into the next month/season/year, it is assumed to belong to the month/season/year in which the window starts.

Parameters

data: str

Name of the column from which to get stats.

args: dict
Dictionary with the following keys:
method: str

It can be persistence exceedence or persistence non-exceedence

Exceedance bins: Min Res Max(optional): list

Minimum, resolution and maximum value of X axis to use

Duration Min Res Max: list

Minimum, resolution and maximum duration to use in hours

folder out: str

Path to save the output

Time blocking: str
if Time blocking=='Annual',

Statistics will be calculated for the whole time series

if Time blocking=='Seasonal (South hemisphere)',

Statistics will be calculated for each South hemisphere season

if Time blocking=='Seasonal (North hemisphere)',

Statistics will be calculated for each North hemisphere season

if Time blocking=='Monthly',

Statistics will be calculated for each month

Examples:

>>> df=tf['test1']['dataframe'].Statistics.weather_window(data='U',args={'time blocking':'Monthly'})

Outputs:

Weather_window example (table; cell values are not shown in the source)

Columns (durations): 6, 12, 18, 24, 36
Rows (thresholds): >0.2, >0.4, >0.6
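Counting full windows can be sketched as below. The helpers `run_lengths` and `count_windows` are hypothetical; the sketch assumes a regular time step and counts non-overlapping windows inside each continuous run, without the averaging over months, seasons or years that the function description mentions.

```python
def run_lengths(flags):
    """Lengths of the consecutive runs of True values."""
    runs, current = [], 0
    for flag in flags:
        if flag:
            current += 1
        elif current:
            runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def count_windows(values, threshold, duration_hours, dt_hours=1.0, exceedence=False):
    """Number of full windows of `duration_hours` within the qualifying runs."""
    flags = [(v > threshold) if exceedence else (v <= threshold) for v in values]
    steps = max(1, int(round(duration_hours / dt_hours)))
    return sum(length // steps for length in run_lengths(flags))


# Hourly series: 14 calm hours, 3 rough hours, 7 calm hours.
series = [0.1] * 14 + [0.5] * 3 + [0.1] * 7
print(count_windows(series, threshold=0.2, duration_hours=6))   # 3
print(count_windows(series, threshold=0.2, duration_hours=12))  # 1
```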

weighted_direction(Hs='Hs', drr='drr', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'time blocking': {'North hemishere(Summer/Winter)': False, 'North hemisphere 4 seasons': False, 'North hemisphere moosoon(SW,NE,Hot season)': False, 'South hemisphere 4 seasons': False, 'South hemisphere(Summer/Winter)': True}})[source]

This function computes the energy-weighted direction based on input time series of Hs and Dir.

Parameters

Hs: str

Name of the column containing significant wave height.

drr: str

Name of the column containing the direction.

args: dict
Dictionary with the following keys:
folder out: str

Path to save the output

Time blocking: str
if Time blocking=='South hemisphere(Summer/Winter)',

Statistics will be calculated for South hemisphere summer and winter seasons

if Time blocking=='South hemisphere 4 seasons',

Statistics will be calculated for each South hemisphere season

if Time blocking=='North hemishere(Summer/Winter)',

Statistics will be calculated for North hemisphere summer and winter seasons

if Time blocking=='North hemisphere 4 seasons',

Statistics will be calculated for each North hemisphere season

if Time blocking=='North hemisphere moosoon(SW,NE,Hot season)',

Statistics will be calculated for the North hemisphere monsoon seasons

Examples:

>>> df=tf['test1']['dataframe'].Statistics.weighted_direction(Hs='hs',drr='drr')

Outputs:

Workability probability (table; cell values are not shown in the source)

Column: Energy weighted direction
Rows: January, February
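One way to picture an energy-weighted direction is a circular mean in which each record is weighted by Hs squared, since wave energy is proportional to Hs². The sketch below follows that assumption; `energy_weighted_direction` is a hypothetical helper and toto's exact weighting is not stated in the source.

```python
import math


def energy_weighted_direction(hs, directions):
    """Circular mean of the directions (degrees), weighted by Hs squared."""
    # Average the unit vectors rather than the angles, so that
    # 350 and 10 degrees average to north instead of 180 degrees.
    sin_sum = sum(h * h * math.sin(math.radians(d)) for h, d in zip(hs, directions))
    cos_sum = sum(h * h * math.cos(math.radians(d)) for h, d in zip(hs, directions))
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# The 2 m record from 90 degrees carries four times the weight of the 1 m record.
print(round(energy_weighted_direction([2.0, 1.0], [90.0, 180.0]), 1))  # 104.0
```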

workability(variables=['data1'], args={'duration min res max': [6, 6, 72], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'persistence exceedence': True, 'persistence non-exceedence': False}, 'threshold for each dataset': [1, 10], 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]

This function provides workability persistence (non-)exceedence tables, i.e. the % of workable time based on limiting parameters (e.g. Hs < 2 m and wind speed < 10 m/s).

Parameters

variables: list

Names of the columns used to create the conditions.

args: dict
Dictionary with the following keys:
method: str

Name of the method to use. Can be persistence exceedence (default) or persistence non-exceedence

threshold for each dataset: list

List of thresholds, one for each parameter listed in variables. variables and the threshold list must have the same length

duration min res max: int

Duration interval in hours

folder out: str

Path to save the output

Time blocking: str
if Time blocking=='Annual',

Statistics will be calculated for the whole time series

if Time blocking=='Seasonal (South hemisphere)',

Statistics will be calculated for each South hemisphere season

if Time blocking=='Seasonal (North hemisphere)',

Statistics will be calculated for each North hemisphere season

if Time blocking=='Monthly',

Statistics will be calculated for each month

Examples:

>>> df=tf['test1']['dataframe'].Statistics.workability(variables=['hs','tp'],args={'threshold for each dataset':[2,15],'Time blocking':'Annual'})

Outputs:

Workability probability (table; cell values are not shown in the source)

Columns (durations): >6, >12, >18, >24, >36
Rows: January, February
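The idea of combining several limiting parameters can be sketched as below. `workable_percent` is a hypothetical helper; it treats a time step as workable when every variable is below its threshold and leaves out the persistence (duration) part of the analysis.

```python
def workable_percent(series, thresholds):
    """Percentage of time steps at which every variable is below its threshold."""
    if len(series) != len(thresholds):
        raise ValueError("one threshold is needed per variable")
    steps = list(zip(*series))  # one tuple of values per time step
    workable = sum(
        1 for step in steps if all(v < t for v, t in zip(step, thresholds))
    )
    return 100.0 * workable / len(steps)


hs = [1.0, 1.5, 2.5, 1.8]
wind = [8.0, 12.0, 9.0, 5.0]
print(workable_percent([hs, wind], [2, 10]))  # 50.0
```

Here only the first and last time steps satisfy both Hs < 2 and wind speed < 10, so half of the record is workable.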