toto.plugins.statistics.common_statistics¶
- class toto.plugins.statistics.common_statistics.Statistics(pandas_obj)[source]¶
Bases: object
- Directional_statistics(mag='mag', drr='drr', args={'Percentile or Quantile': 0.1, 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'function': {'Max': True, 'Mean': False, 'Median': False, 'Min': False, 'Percentile': False, 'Prod': False, 'Quantile': False, 'Std': False, 'Sum': False, 'Var': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
Extract statistics for the selected directional bins.
Parameters¶
- mag: str
Name of the column from which to get stats.
- drr: str
Column name representing the directions.
- args: dict
Dictionary with the following keys:
- function: str
Name of the function to use; can be Max, Mean, Median, Min, Percentile, Prod, Quantile, Std, Sum or Var
- Percentile or Quantile: float
Percentile or quantile value, depending on the function
- direction binning: str
Can be centered or not-centered, depending on whether the directional bins are centered on 0
- direction interval: int
Directional interval for the bins, in degrees
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.Directional_statistics(mag='U',drr='drr',args={'direction interval':45,'time blocking':'Annual'})
Outputs:¶
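Directional statistics example: a table whose top-left cell names the statistic (here MEAN), with one column per directional bin (N, S, E, W) plus Total, and one row per time block (January, February, Annual).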
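The directional binning described in the parameters above can be sketched with plain pandas. This is an illustration only, not toto's implementation: the helper `direction_bin` and the sample data are hypothetical, and "centered" is assumed to mean that the first bin spans -interval/2 to +interval/2 around 0 degrees.

```python
import numpy as np
import pandas as pd


def direction_bin(drr, interval=45.0, centered=True):
    """Return the lower edge (in degrees) of the directional bin of each value.

    Assumption: with centered=True the first bin is centred on 0 degrees,
    i.e. it spans [-interval/2, +interval/2).
    """
    drr = np.asarray(drr, dtype=float) % 360.0
    offset = interval / 2.0 if centered else 0.0
    index = np.floor(((drr + offset) % 360.0) / interval)
    return (index * interval - offset) % 360.0


# Hypothetical data: magnitude 'U' and direction 'drr' in degrees.
df = pd.DataFrame({
    "U": [1.0, 3.0, 2.0, 5.0, 4.0],
    "drr": [350.0, 10.0, 44.0, 180.0, 200.0],
})
df["bin"] = direction_bin(df["drr"], interval=45.0, centered=True)

# 'Max' statistic per directional bin, plus the omnidirectional total.
per_bin = df.groupby("bin")["U"].max()
total = df["U"].max()
```

With centered 45-degree bins, 350 and 10 degrees fall in the same northern sector (lower edge 337.5), which is the practical reason for centering the bins on 0.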
- common_statistics(mag=['mag'], drr='drr', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'minimum occurrence (main direction) [%]': 15, 'stats': 'n min max mean std [1,5,10,50,90,95,99]', 'time blocking': {'north hemishere(Summer/Winter)': False, 'north hemisphere 4 seasons': False, 'north hemisphere moosoon(SW,NE,Hot season)': False, 'south hemisphere 4 seasons': False, 'south hemisphere(Summer/Winter)': False, 'yearly': True}})[source]¶
Extract statistics from a pandas DataFrame column.
Parameters¶
- mag: str or list
Name of the column from which to get stats. Can be a list to extract stats from multiple columns.
- drr: str, optional
Column name representing the directions.
- args: dict
Dictionary with the following keys:
- minimum occurrence (main direction) [%]: int
Used to determine the main direction: a direction is a main direction when its occurrence is >= this minimum. Default is 15
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='yearly', statistics will be calculated for the whole time series
- if time blocking=='south hemisphere(Summer/Winter)', statistics will be calculated for the South hemisphere summer and winter seasons
- if time blocking=='south hemisphere 4 seasons', statistics will be calculated for each South hemisphere season
- if time blocking=='north hemishere(Summer/Winter)', statistics will be calculated for the North hemisphere summer and winter seasons
- if time blocking=='north hemisphere 4 seasons', statistics will be calculated for each North hemisphere season
- if time blocking=='north hemisphere moosoon(SW,NE,Hot season)', statistics will be calculated for the North hemisphere monsoon seasons
- stats: str
String listing the statistics to compute (each name must be a NumPy function), for example:
n min max mean std [1,5,10,50,90,95,99], where n is the number of samples.
Put the exceedance (percentile) values inside [].
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.common_statistics(mag='U',drr='drr',args={'time blocking':'yearly'})
Outputs:¶
Common statistics example: a table with one column per statistic (N, min, max, mean, std, P1, P90, Main Direction) and one row per time block (June, July, Winter, Total).
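The stats string described above (names of NumPy functions followed by percentile values in square brackets) could be interpreted along the following lines. This is a hedged sketch, not the library's parser: `parse_stats` and `summarize` are hypothetical helpers, and the use of NaN-aware reducers (`nanmin`, `nanmean`, ...) is an assumption.

```python
import re

import numpy as np


def parse_stats(spec):
    """Split a spec such as 'n min max [1,50,99]' into names and percentiles."""
    match = re.search(r"\[(.*?)\]", spec)
    percentiles = []
    if match:
        percentiles = [float(p) for p in match.group(1).split(",") if p.strip()]
    names = re.sub(r"\[.*?\]", " ", spec).split()
    return names, percentiles


def summarize(values, spec):
    """Compute the requested statistics for one column of data."""
    values = np.asarray(values, dtype=float)
    names, percentiles = parse_stats(spec)
    out = {}
    for name in names:
        if name == "n":
            # 'n' is the number of (non-NaN) samples.
            out["n"] = int(np.sum(~np.isnan(values)))
        else:
            # Assumption: NaN-aware NumPy reducers are used for each name.
            out[name] = float(getattr(np, "nan" + name)(values))
    for p in percentiles:
        out["P%g" % p] = float(np.nanpercentile(values, p))
    return out


result = summarize([1.0, 2.0, 3.0, 4.0, 5.0], "n min max mean std [50,90]")
```

The keys `P50` and `P90` mirror the `P1` and `P90` column names of the example output; how the library labels them internally is not confirmed here.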
- comparison_statistics(measured='measured', hindcast='hindcast', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs'})[source]¶
Extract comparison statistics such as BIAS, MAE, RMSE and MRAE.
Parameters¶
- measured: str
Name of the column representing the measured data.
- hindcast: str
Name of the column representing the hindcast data.
- args: dict
Dictionary with the following keys:
- folder out: str
Path to save the output
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.comparison_statistics(measured='U',hindcast='u',args={'folder out':'/tmp'})
Outputs:¶
Comparison statistics example:
- MAE: Mean Absolute Error
- RMSE: Root Mean Square Error
- MRAE: Mean Relative Absolute Error
- BIAS: Bias
- SI: Scatter Index
- IOA: Index of Agreement
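For reference, the metrics listed above can be computed with NumPy as follows. The formulas are common textbook definitions and are assumptions here: the library may differ in detail (for example, the scatter index is sometimes computed after removing the bias, and the index of agreement shown is Willmott's form). The function name `comparison_metrics` is hypothetical.

```python
import numpy as np


def comparison_metrics(measured, hindcast):
    """Common model-versus-measurement skill metrics (textbook definitions)."""
    m = np.asarray(measured, dtype=float)
    h = np.asarray(hindcast, dtype=float)
    err = h - m
    m_mean = np.mean(m)
    rmse = np.sqrt(np.mean(err ** 2))
    # Willmott's index of agreement: 1 - SSE / potential error.
    potential = np.sum((np.abs(h - m_mean) + np.abs(m - m_mean)) ** 2)
    return {
        "BIAS": float(np.mean(err)),
        "MAE": float(np.mean(np.abs(err))),
        "RMSE": float(rmse),
        "MRAE": float(np.mean(np.abs(err) / np.abs(m))),
        "SI": float(rmse / m_mean),  # assumption: RMSE normalised by the measured mean
        "IOA": float(1.0 - np.sum(err ** 2) / potential),
    }


metrics = comparison_metrics(
    measured=[1.0, 2.0, 3.0, 4.0],
    hindcast=[1.5, 2.0, 2.5, 5.0],
)
```

MRAE divides by the measured values, so it is undefined wherever the measurement is exactly zero; filtering such samples beforehand is left to the caller in this sketch.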
- excedence_coincidence_probability(data='data', coincident_nodir='coincident_nodir', coincident_with_dir='coincident_with_dir', args={'Coincidence bins: Min Res Max(optional)': [0, 2], 'Duration Min Res Max': [6, 6, 72], 'Exceedance bins: Min Res Max(optional)': [0, 2], 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'exceedence': True, 'non-exceedence': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
Exceedance and non-exceedance analysis coincident with another parameter. Similar to the joint probability function, but with a cumulative sum applied to obtain the exceedance or non-exceedance (in %).
Parameters¶
- data: str
Name of the column from which to get stats.
- coincident_with_dir: str
Column name representing the directions.
- coincident_nodir: str
Column name representing another magnitude.
- args: dict
Dictionary with the following keys:
- method: str
Name of the method to use; can be exceedence or non-exceedence
- direction binning: str
Can be centered or not-centered, depending on whether the directional bins are centered on 0
- direction interval: int
Directional interval for the bins, in degrees
- folder out: str
Path to save the output
- Probablity expressed in: str
This can be percent or per thoushand
- Exceedance bins: Min Res Max(optional): list
Minimum, resolution and maximum value of the X axis used in the joint probability
- Coincidence bins: Min Res Max(optional): list
Minimum, resolution and maximum value of the Y axis used in the joint probability
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.excedence_coincidence_probability(data='U',coincident_with_dir='drr',args={'direction interval':45,'time blocking':'Annual'})
Outputs:¶
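Excedence coincidence probability: a table whose top-left cell reads "exceedence %", with columns for the bins 0.0-0.2, 0.2-0.4, 0.4-0.6 and 0.6-0.8 plus Total, and rows for the exceedance levels >0.0, >0.2 and >0.4.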
- exceedence_probability(data='data', args={'duration Min Res Max': [6, 6, 72], 'exceedance bins: Min Res Max(optional)': [2, 1, 22], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'exceedence': False, 'non-exceedence': False, 'persistence exceedence': True, 'persistence non-exceedence': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
This function calculates the frequency of occurrence of data:
- exceeding specific values (exceedence)
- not exceeding specific values (non-exceedence)
- exceeding specific values during a specific duration (persistence exceedence)
- not exceeding specific values during a specific duration (persistence non-exceedence)
Parameters¶
- data: str
Name of the column from which to get stats.
- args: dict
Dictionary with the following keys:
- method: str
Can be exceedence, non-exceedence, persistence exceedence or persistence non-exceedence
- exceedance bins: Min Res Max(optional): list
Minimum, resolution and maximum value of X axis to use
- duration Min Res Max: list
Minimum, resolution and maximum duration to use in hours
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.exceedence_probability(data='U',args={'time blocking':'Monthly'})
Outputs:¶
Weather_window example: a table with one column per duration in hours (6, 12, 18, 24, 36) and one row per exceedance level (>0.2, >0.4, >0.6).
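The plain exceedence and non-exceedence methods described above reduce to counting the share of samples above each threshold. A minimal sketch follows, assuming percentages and strict "greater than" comparisons; the helper `exceedance_percent` is hypothetical and the persistence variants (which add a minimum duration) are not covered.

```python
import numpy as np


def exceedance_percent(values, thresholds, non_exceedance=False):
    """Percentage of samples above (or not above) each threshold."""
    values = np.asarray(values, dtype=float)
    values = values[~np.isnan(values)]  # ignore missing samples
    result = {}
    for t in thresholds:
        above = float(np.mean(values > t) * 100.0)
        result[t] = 100.0 - above if non_exceedance else above
    return result


# Hypothetical current speeds and exceedance levels.
u = [0.1, 0.3, 0.5, 0.7, 0.9]
levels = [0.0, 0.2, 0.4]

exceed = exceedance_percent(u, levels)
non_exceed = exceedance_percent(u, levels, non_exceedance=True)
```

For each level the two results sum to 100, which is a convenient sanity check when comparing exceedence and non-exceedence tables.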
- joint_probability(mag='speed', drr='direction', period='period', args={'X Min Res Max(optional)': [2, 1, 22], 'Y Min Res Max(optional)': [0, 0.5], 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'Mag vs Dir': True, 'Mag vs Per': False, 'Per Vs Dir': False}, 'probablity expressed in': {'per thoushand': True, 'percent': False}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
This function provides joint distribution tables for X and Y, i.e. the probability of events defined in terms of both X and Y (per 1000). It can be applied to magnitude-direction, magnitude-period or period-direction.
Parameters¶
- mag: str
Name of the column from which to get stats.
- drr: str
Column name representing the directions. Used if method is Per Vs Dir or Mag vs Dir
- period: str
Column name representing the period. Used if method is Per Vs Dir or Mag vs Per
- args: dict
Dictionary with the following keys:
- method: str
Name of the method to use; can be:
- Mag vs Dir: plot magnitude versus direction
- Per Vs Dir: plot period versus direction
- Mag vs Per: plot magnitude versus period
- direction binning: str
Can be centered or not-centered, depending on whether the directional bins are centered on 0
- direction interval: int
Directional interval for the bins, in degrees
- folder out: str
Path to save the output
- probablity expressed in: str
This can be percent or per thoushand
- X Min Res Max(optional): list
Minimum, resolution and maximum value of the X axis used in the joint probability
- Y Min Res Max(optional): list
Minimum, resolution and maximum value of the Y axis used in the joint probability
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.joint_probability(mag='U',drr='drr',args={'direction interval':45,'time blocking':'Annual'})
Outputs:¶
Joint probability example: a table for January with columns 0, 1, 2, 3 and Total, and rows 0, 1, 2 and Total; the only value shown is 100, in the Total row.
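A joint distribution table of the kind described above (magnitude versus direction, expressed per thousand) can be approximated with `numpy.histogram2d`. This is a sketch under stated assumptions, not the library's code: the sample data are invented, the direction bins are not centered, and the bin edges are built from hypothetical Min/Res/Max values.

```python
import numpy as np

# Hypothetical speed (m/s) and direction (degrees) samples.
speed = np.array([2.5, 3.5, 2.2, 4.8, 3.1, 2.9])
direction = np.array([10.0, 100.0, 20.0, 190.0, 95.0, 350.0])

# Magnitude bin edges from Min=2, Res=1, Max=5; non-centered 90-degree sectors.
x_edges = np.arange(2.0, 6.0, 1.0)
y_edges = np.arange(0.0, 361.0, 90.0)

counts, _, _ = np.histogram2d(speed, direction, bins=[x_edges, y_edges])

# Convert counts to occurrences per thousand and add the marginal totals.
per_thousand = counts / counts.sum() * 1000.0
row_totals = per_thousand.sum(axis=1)   # total per magnitude bin
col_totals = per_thousand.sum(axis=0)   # total per direction sector
```

Dividing by `counts.sum()` means samples falling outside the bin edges are excluded from the normalisation; whether the library treats out-of-range samples the same way is not confirmed here.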
- modal_wave_period(Hs='Hs', Tp='Tp', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'time blocking': {'North hemishere(Summer/Winter)': False, 'North hemisphere 4 seasons': False, 'North hemisphere moosoon(SW,NE,Hot season)': False, 'South hemisphere 4 seasons': False, 'South hemisphere(Summer/Winter)': True}})[source]¶
This function computes the modal period for a set of Hs/Tp values. The modal period is taken as the mean period of the top 5% of wave heights.
Parameters¶
- Hs: str
Name of the column containing significant wave height.
- Tp: str
Name of the column containing the wave period.
- args: dict
Dictionary with the following keys:
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='South hemisphere(Summer/Winter)', statistics will be calculated for the South hemisphere summer and winter seasons
- if time blocking=='South hemisphere 4 seasons', statistics will be calculated for each South hemisphere season
- if time blocking=='North hemishere(Summer/Winter)', statistics will be calculated for the North hemisphere summer and winter seasons
- if time blocking=='North hemisphere 4 seasons', statistics will be calculated for each North hemisphere season
- if time blocking=='North hemisphere moosoon(SW,NE,Hot season)', statistics will be calculated for the North hemisphere monsoon seasons
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.modal_wave_period(Hs='hs',Tp='tp')
Outputs:¶
Modal wave period probability: a table with a single column, Modal wave period, and one row per time block (January, February).
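The definition given above (mean period of the top 5% of wave heights) can be written in a few lines of NumPy. This is a sketch: `modal_period` is a hypothetical helper, and selecting the top 5% through the 95th percentile of Hs (with ties included) is an assumption about how "top 5%" is applied.

```python
import numpy as np


def modal_period(hs, tp, top_fraction=0.05):
    """Mean Tp over the sea states whose Hs lies in the top `top_fraction`."""
    hs = np.asarray(hs, dtype=float)
    tp = np.asarray(tp, dtype=float)
    threshold = np.quantile(hs, 1.0 - top_fraction)
    return float(np.mean(tp[hs >= threshold]))


# Hypothetical record: 100 wave heights 1..100 m; the largest have Tp = 12 s.
hs = np.arange(1.0, 101.0)
tp = np.where(hs > 95, 12.0, 8.0)

modal = modal_period(hs, tp)
```

Applying the same function to each month or season of the record would produce the per-time-block rows shown in the example output.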
- wave_population(Hs='Hs', Tm02='Tm02', Drr_optional='Drr_optional', Tp_optional='Tp_optional', SW_optional='SW', args={'Exposure (years) (= length of time series if not specified)': 0, 'Heigh bin size': 0.5, 'Method': {'Height only': True, 'Height/Direction': False, 'Height/Tp': False, 'Height/period': False}, 'Period bin size': 2, 'direction binning': {'centered': True, 'not-centered': False}, 'direction interval': 45.0, 'directional switch': {'Off': False, 'On': True}, 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs'})[source]¶
This function computes the wave population for fatigue analysis:
- based on the Rayleigh distribution if the spectral width parameter (SW) is not specified
- based on the Longuet-Higgins Hs-Tp joint probability distribution if SW is specified
Parameters¶
- Hs: str
Name of the column containing significant wave height.
- Tm02: str
Name of the column containing the mean wave period computed from spectral moments of order 0 and 2.
- Drr_optional: str
Optional column containing the direction
- Tp_optional: str
Optional column containing the wave period
- SW_optional: str
Optional column containing the spectral width parameter
- args: dict
Dictionary with the following keys:
- Method: str
Name of the method to use; can be Height only, Height/Direction, Height/Tp or Height/period
- direction binning: str
Can be centered or not-centered, depending on whether the directional bins are centered on 0
- direction interval: int
Directional interval for the bins, in degrees
- Heigh bin size: float
Interval in meters for Hs
- Period bin size: float
Interval in seconds for the period
- Exposure (years) (= length of time series if not specified): int
Number of years to use; defaults to the length of the time series if not specified
- folder out: str
Path to save the output
- directional switch: str
Can be On or Off to use direction
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.wave_population(Hs='hs',Tm02='tm02',args={'Method':'Height only'})
Outputs:¶
Workability probability: a table with columns Omni, N, S, E and W, and rows for the height bins > 0.0 <= 0.1, > 0.1 <= 0.2 and > 0.2 <= 0.3, plus Total.
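The Rayleigh-based "Height only" case described above can be illustrated as follows. This is a simplified sketch rather than the library's algorithm: it assumes each sea state lasts `dt_seconds`, contains `dt_seconds / Tm02` individual waves, and that individual wave heights follow the Rayleigh exceedance law P(H > h) = exp(-2 (h / Hs)^2). The function name, the `h_max` cut-off and the absence of exposure scaling are all assumptions.

```python
import numpy as np


def wave_population(hs, tm02, dt_seconds, bin_size=0.5, h_max=10.0):
    """Expected number of individual waves per height bin (Rayleigh)."""
    hs = np.asarray(hs, dtype=float)
    tm02 = np.asarray(tm02, dtype=float)
    edges = np.arange(0.0, h_max + bin_size, bin_size)

    # Number of individual waves in each sea state.
    n_waves = dt_seconds / tm02

    # Rayleigh exceedance probability at every bin edge, per sea state.
    exceed = np.exp(-2.0 * (edges[None, :] / hs[:, None]) ** 2)

    # Probability mass in each bin, scaled by the number of waves.
    per_bin = n_waves[:, None] * (exceed[:, :-1] - exceed[:, 1:])
    return edges, per_bin.sum(axis=0)


# Two hypothetical hourly sea states.
edges, population = wave_population(
    hs=[1.0, 2.0], tm02=[5.0, 6.0], dt_seconds=3600.0
)
```

Summing the population over all bins recovers the total number of waves (here 3600/5 + 3600/6), up to the negligible tail above `h_max`.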
- weather_window(data='data', args={'Duration Min Res Max': [6, 6, 72], 'Exceedance bins: Min Res Max(optional)': [2, 1, 22], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'persistence exceedence': False, 'persistence non-exceedence': True}, 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
This function calculates the average number of full windows for data:
- exceeding specific values during a specific duration (persistence exceedence)
- not exceeding specific values during a specific duration (persistence non-exceedence)
Note: if a window overlaps into the next month/season/year, it is assumed to belong to the month/season/year in which the window starts.
Parameters¶
- data: str
Name of the column from which to get stats.
- args: dict
Dictionary with the following keys:
- method: str
It can be persistence exceedence or persistence non-exceedence
- Exceedance bins: Min Res Max(optional): list
Minimum, resolution and maximum value of X axis to use
- Duration Min Res Max: list
Minimum, resolution and maximum duration to use in hours
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.weather_window(data='U',args={'time blocking':'Monthly'})
Outputs:¶
Weather_window example: a table with one column per duration in hours (6, 12, 18, 24, 36) and one row per threshold (>0.2, >0.4, >0.6).
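One plausible reading of "number of full windows" is the count of non-overlapping windows of the requested duration that fit inside continuous runs where the condition holds. The sketch below implements that reading only; the helper `count_windows`, the regular time step `dt_h` and the non-overlapping convention are assumptions, and the averaging over months, seasons or years is omitted.

```python
import numpy as np


def count_windows(values, threshold, duration_h, dt_h=1.0, exceed=False):
    """Count full, non-overlapping windows of `duration_h` hours.

    A window is a stretch of consecutive samples that all exceed the
    threshold (exceed=True) or all stay at or below it (exceed=False).
    """
    values = np.asarray(values, dtype=float)
    ok = values > threshold if exceed else values <= threshold
    steps = int(round(duration_h / dt_h))
    windows = 0
    run = 0
    for flag in ok:
        run = run + 1 if flag else 0
        if run == steps:
            # A full window has been completed; start counting the next one.
            windows += 1
            run = 0
    return windows


# Hypothetical hourly Hs record (m).
hs = [0.5, 0.6, 0.4, 0.3, 0.5, 0.2, 1.5, 0.4, 0.3, 0.2, 1.8, 0.1]

n_3h = count_windows(hs, threshold=1.0, duration_h=3)
n_6h = count_windows(hs, threshold=1.0, duration_h=6)
```

Longer durations can only yield the same or fewer windows, which is why the example table is read across increasing durations.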
- weighted_direction(Hs='Hs', drr='drr', args={'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'time blocking': {'North hemishere(Summer/Winter)': False, 'North hemisphere 4 seasons': False, 'North hemisphere moosoon(SW,NE,Hot season)': False, 'South hemisphere 4 seasons': False, 'South hemisphere(Summer/Winter)': True}})[source]¶
This function computes the energy-weighted direction based on input time series of Hs and Dir.
Parameters¶
- Hs: str
Name of the column containing significant wave height.
- drr: str
Name of the column containing the direction.
- args: dict
Dictionary with the following keys:
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='South hemisphere(Summer/Winter)', statistics will be calculated for the South hemisphere summer and winter seasons
- if time blocking=='South hemisphere 4 seasons', statistics will be calculated for each South hemisphere season
- if time blocking=='North hemishere(Summer/Winter)', statistics will be calculated for the North hemisphere summer and winter seasons
- if time blocking=='North hemisphere 4 seasons', statistics will be calculated for each North hemisphere season
- if time blocking=='North hemisphere moosoon(SW,NE,Hot season)', statistics will be calculated for the North hemisphere monsoon seasons
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.weighted_direction(Hs='hs',drr='drr')
Outputs:¶
Workability probability: a table with a single column, Energy weighted direction, and one row per time block (January, February).
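An energy-weighted direction is commonly computed as a vector mean in which each direction is weighted by the wave energy. The sketch below assumes energy is proportional to Hs squared and that directions are compass bearings in degrees; both are assumptions, and `energy_weighted_direction` is a hypothetical helper rather than the library's routine.

```python
import numpy as np


def energy_weighted_direction(hs, drr_deg):
    """Vector-mean direction (degrees, 0-360) weighted by Hs squared."""
    hs = np.asarray(hs, dtype=float)
    theta = np.deg2rad(np.asarray(drr_deg, dtype=float))
    weight = hs ** 2  # assumption: wave energy is proportional to Hs^2

    # Sum the weighted unit vectors, then convert back to a bearing.
    east = np.sum(weight * np.sin(theta))
    north = np.sum(weight * np.cos(theta))
    return float(np.rad2deg(np.arctan2(east, north)) % 360.0)


# Two opposing hypothetical sea states: the more energetic one dominates.
wdir = energy_weighted_direction(hs=[2.0, 1.0], drr_deg=[90.0, 270.0])

# Directions either side of north average to north, not to 180 degrees.
north_dir = energy_weighted_direction(hs=[1.0, 1.0], drr_deg=[350.0, 10.0])
```

Averaging on the unit circle avoids the wrap-around problem of a plain arithmetic mean, where 350 and 10 degrees would incorrectly average to 180.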
- workability(variables=['data1'], args={'duration min res max': [6, 6, 72], 'folder out': '/home/docs/checkouts/readthedocs.org/user_builds/totodoc/checkouts/latest/docs', 'method': {'persistence exceedence': True, 'persistence non-exceedence': False}, 'threshold for each dataset': [1, 10], 'time blocking': {'Annual': True, 'Monthly': False, 'Seasonal (North hemisphere)': False, 'Seasonal (South hemisphere)': False}})[source]¶
This function provides workability persistence (non-)exceedence tables, i.e. the % of workable time based on limiting parameters (e.g. Hs < 2 m and wind speed < 10 m/s).
Parameters¶
- variables: list
Names of the columns used to create the conditions.
- args: dict
Dictionary with the following keys:
- method: str
Name of the method to use; can be persistence exceedence (default) or persistence non-exceedence
- threshold for each dataset: list
List of thresholds, one for each parameter listed in variables. variables and threshold for each dataset must have the same length
- duration min res max: list
Minimum, resolution and maximum duration to use, in hours
- folder out: str
Path to save the output
- time blocking: str
- if time blocking=='Annual', statistics will be calculated for the whole time series
- if time blocking=='Seasonal (South hemisphere)', statistics will be calculated for each South hemisphere season
- if time blocking=='Seasonal (North hemisphere)', statistics will be calculated for each North hemisphere season
- if time blocking=='Monthly', statistics will be calculated for each month
Examples:¶
>>> df=tf['test1']['dataframe'].Statistics.workability(variables=['hs','tp'],args={'threshold for each dataset':[2,15],'time blocking':'Annual'})
Outputs:¶
Workability probability: a table with one column per duration in hours (>6, >12, >18, >24, >36) and one row per time block (January, February).
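The combined condition described above (for example Hs < 2 m and wind speed < 10 m/s) can be sketched as a logical AND across variables. This minimal example covers only the instantaneous percentage of workable time; the persistence (minimum duration) part of the method is not included, and `workable_percent` and the sample data are hypothetical.

```python
import numpy as np


def workable_percent(columns, thresholds):
    """Percentage of time steps where every variable is below its threshold."""
    if len(columns) != len(thresholds):
        raise ValueError("one threshold is required per variable")
    workable = np.ones(len(columns[0]), dtype=bool)
    for values, limit in zip(columns, thresholds):
        workable &= np.asarray(values, dtype=float) < limit
    return float(np.mean(workable) * 100.0)


# Hypothetical Hs (m) and wind speed (m/s) at four time steps.
hs = [0.5, 1.5, 2.5, 1.0]
wind = [5.0, 12.0, 8.0, 9.0]

pct = workable_percent([hs, wind], thresholds=[2.0, 10.0])
```

The length check mirrors the requirement that the list of variables and the list of thresholds have the same length.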