Input¶
Functions to read time series from file into
totoframe.
The input functions abstract away the format in which the data are stored on disk and load them into a standard pandas DataFrame object. The methods add attributes to the DataFrame such as unit, latitude and longitude.
Reading functions are defined in modules within the
toto.inputs subpackage. The functions can be accessed as:
from toto.inputs.nc import NCfile
dset = NCfile('myfile.nc')._toDataFrame()
The following convention is expected for defining reading functions:
- Functions for different file types are defined in different modules within the toto.inputs subpackage.
- Modules are named as filetype.py, e.g., nc.py.
- Classes are named as FILETYPEfile, e.g., NCfile.
- Each class must have a _toDataFrame() function.
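As a rough illustration of this convention, the sketch below shows what a reader for a made-up file type could look like. The module name xyz.py, the class XYZfile, the comma-separated layout and the _parse helper are all hypothetical and are not part of toto; only the _toDataFrame() entry point comes from the convention above.

```python
# Hypothetical module toto/inputs/xyz.py (illustration only, not part of toto)
import csv


class XYZfile:
    """Sketch of a reader for an imaginary comma-separated 'xyz' format."""

    def __init__(self, filenames):
        # Accept a single path or a list of paths.
        if isinstance(filenames, str):
            filenames = [filenames]
        self.filenames = list(filenames)

    def _parse(self, lines):
        # Assumed layout: one header line followed by comma-separated values.
        return list(csv.DictReader(lines))

    def _toDataFrame(self):
        # pandas is imported here so that the parsing step above can be
        # tried without it; returning a single concatenated frame is an
        # assumption of this sketch.
        import pandas as pd

        frames = []
        for filename in self.filenames:
            with open(filename, newline="") as handle:
                frames.append(pd.DataFrame(self._parse(handle)))
        return pd.concat(frames, ignore_index=True)
```

Keeping the parsing step separate from _toDataFrame() makes it possible to check the file-format logic on a few in-memory lines before touching real files.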
The following input functions are currently available:
Generic NetCDF:¶
Read a generic NetCDF file. This import function works well with NetCDF or Zarr files created by xarray. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Examples¶
>>> from toto.inputs.nc import NCfile
>>> nc=NCfile('filename.nc')._toDataFrame()
MSL NetCDF:¶
Read an MSL NetCDF file. This import function works with NetCDF files created by MetOcean Solutions Ltd. These NetCDF files have been extracted by the UDS. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Examples¶
>>> from toto.inputs.msl import MSLfile
>>> nc=MSLfile('filename.nc')._toDataFrame()
LINZ NetCDF:¶
Read a LINZ NetCDF file. This import function works with NetCDF files created from LINZ tide gauge data. It reads both sensors as well as the README file, which should be in the same directory. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process. Each can be either a NetCDF file made by linz.download or a csv file downloaded directly from the LINZ website.
Examples¶
>>> from toto.inputs.linz import LINZfile
>>> nc=LINZfile('filename.nc')._toDataFrame()
MOET NetCDF:¶
Read a MOET NetCDF file. This import function works with NetCDF files created by MetOcean Solutions Ltd. These NetCDF files have a special format designed to be read by the MOET software. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Examples¶
>>> from toto.inputs.moet import MOETfile
>>> nc=MOETfile('filename.nc')._toDataFrame()
MATLAB¶
Read a MATLAB file. This imports a .mat file. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Notes¶
The file MUST contain a variable called time, t or timestamp holding the time steps as MATLAB datenum values.
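For reference, a MATLAB datenum counts days from a fictitious year 0, so it differs from Python's proleptic Gregorian ordinal by 366 days. The helper below is a minimal standard-library sketch of that conversion; the name datenum_to_datetime is made up for illustration and is not necessarily how MATfile converts time internally.

```python
from datetime import datetime, timedelta


def datenum_to_datetime(datenum):
    """Convert a MATLAB datenum (days, possibly fractional) to a datetime."""
    whole_days = int(datenum)
    fraction = datenum - whole_days
    # MATLAB day 1 is 0000-01-01 while Python ordinal 1 is 0001-01-01,
    # so the two scales are offset by 366 days.
    return (datetime.fromordinal(whole_days)
            + timedelta(days=fraction)
            - timedelta(days=366))


print(datenum_to_datetime(719529.0))  # 1970-01-01 00:00:00
print(datenum_to_datetime(737791.5))  # 2020-01-01 12:00:00
```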
Examples¶
>>> from toto.inputs.mat import MATfile
>>> nc=MATfile('filename.mat')._toDataFrame()
TRYAXIS¶
Read a TRYAXIS file. This imports raw files from a TRYAXIS wave buoy. This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Notes¶
The function only works with the NONDIRSPEC and DIRSPEC files.
Examples¶
>>> from toto.inputs.tryaxis import TRYAXISfile
>>> nc=TRYAXISfile('filename.NONDIRSPEC')._toDataFrame()
TEXT¶
Read a txt or csv file. This imports a text file. The function uses the read_csv function from pandas (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html). This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
- sep: str, default {_default_sep}
Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and will automatically detect the separator with Python's builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.
- skiprows: list-like, int or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file. If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].
- skipfooter: int, default 0
Number of lines at bottom of file to skip (unsupported with engine='c').
- miss_val: scalar, str, list-like, or dict, optional
Additional strings to recognize as NA/NaN. If a dict is passed, specific per-column NA values.
- colNamesLine: int, default 1
Line number where the headers are defined.
- unitNamesLine: int, default 1
Line number where the units are defined.
- single_column: bool, default False
Whether the time is represented in a single column.
- customUnit: str, default '%d-%m-%Y %H:%M:%S'
String representing the time format.
- unit: str, default 's'; can be 'auto', 'custom', 'matlab', 's' or 'D'
Unit of the single time column. Only matters if single_column is True.
- time_col_name: dict, default {'Year':'year','Month':'month','Day':'day','Hour':'hour','Min':'Minute','Sec':'Second'}
Dictionary for renaming each column so that pandas can interpret the time. Only matters if single_column is False.
- colNames: List, default []
List of column names to use.
- unitNames: List, default []
List of units to use.
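The automatic separator detection described for sep relies on csv.Sniffer from the standard library. The snippet below is only a sketch of that mechanism on an in-memory sample; the sample text and the guess_separator helper are invented for illustration, and TXTfile itself may handle sep differently.

```python
import csv


def guess_separator(sample, candidates=",;\t|"):
    """Guess the column delimiter of a text sample with csv.Sniffer."""
    # Restricting the candidates avoids picking characters that merely
    # happen to repeat regularly, such as '-' or ':' inside timestamps.
    dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    return dialect.delimiter


sample = (
    "time;elev;tp\n"
    "01-01-2020 00:00:00;1.2;8.5\n"
    "01-01-2020 01:00:00;1.4;8.7\n"
)
print(guess_separator(sample))  # ;
```

Passing an explicit sep remains the safer choice when the file layout is known, since sniffing works from a sample and can fail on irregular files.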
Notes¶
When opening the TOTOVIEW GUI, this function will be called with totoview.inputs.txtGUI.
Examples¶
>>> from toto.inputs.txt import TXTfile
>>> tx=TXTfile([filename],colNamesLine=1,miss_val='NaN', sep=',',skiprows=1,unit='custom',time_col_name='time',unitNamesLine=0, single_column=True,customUnit='%d/%m/%Y %H:%M')
>>> tx.reads()
>>> tx.read_time()
>>> df=tx._toDataFrame()
CONSTITUENTS FILE¶
Read a constituents file. This imports a file containing the amplitude and phase of each tidal constituent. The function uses the read_csv function from pandas (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html) to read three columns:
- Constituent name
- Constituent phase
- Constituent amplitude
This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
This uses the UTide module (https://github.com/wesleybowman/UTide).
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
- sep: str, default {_default_sep}
Delimiter to use. If sep is None, the C engine cannot automatically detect the separator, but the Python parsing engine can, meaning the latter will be used and will automatically detect the separator with Python's builtin sniffer tool, csv.Sniffer. In addition, separators longer than 1 character and different from '\s+' will be interpreted as regular expressions and will also force the use of the Python parsing engine. Note that regex delimiters are prone to ignoring quoted data. Regex example: '\r\t'.
- skiprows: list-like, int or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file. If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].
- skipfooter: int, default 0
Number of lines at bottom of file to skip (unsupported with engine='c').
- colNamesLine: int, default 1
Line number where the headers are defined.
- unit: str, default 'degrees'; can be 'radians'
Unit of the phases.
- min_date: datetime, default datetime.datetime(2020,1,1)
Start time of the time series.
- max_date: datetime, default datetime.datetime(2020,1,1)
End time of the time series.
- dt: int, default 3600
Time step in seconds to use when creating the time series.
- latitude: int, default -40
Latitude used to calculate the time series.
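To make the role of min_date, max_date, dt and the amplitude and phase columns concrete, here is a deliberately simplified harmonic synthesis written with the standard library only. It is not what CONSfile or UTide does: it ignores nodal corrections, the latitude and the astronomical reference of the phases, and the synthesize helper and the constituent values are invented for illustration. The angular speeds of M2 and S2 (28.9841042 and 30 degrees per hour) are standard values.

```python
import math
from datetime import datetime, timedelta

# Angular speeds in degrees per hour.
SPEEDS = {"M2": 28.9841042, "S2": 30.0}


def synthesize(cons, min_date, max_date, dt=3600):
    """Sum amplitude * cos(speed * t - phase) over all constituents.

    cons maps a constituent name to (amplitude, phase in degrees);
    t is the time elapsed since min_date, in hours.
    """
    series = []
    current = min_date
    while current <= max_date:
        hours = (current - min_date).total_seconds() / 3600.0
        level = sum(
            amp * math.cos(math.radians(SPEEDS[name] * hours - pha))
            for name, (amp, pha) in cons.items()
        )
        series.append((current, level))
        current += timedelta(seconds=dt)
    return series


# Hypothetical single-constituent example over twelve hours.
tide = synthesize({"S2": (0.5, 0.0)}, datetime(2020, 1, 1), datetime(2020, 1, 1, 12))
```

With a 12-hour S2 period, the series starts at +0.5, reaches -0.5 after six hours and returns to +0.5 at the end of the window.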
Notes¶
When opening the TOTOVIEW GUI, this function will be called with totoview.inputs.consGUI.
Examples¶
>>> from toto.inputs.cons import CONSfile
>>> nc=CONSfile(['cons_list.csv'],sep=',',
...             colNames=[],
...             unit='degrees',
...             miss_val='NaN',
...             colNamesLine=1,
...             skiprows=1,
...             skipfooter=0,
...             col_name={'cons':'Cons','amp':'Amplitude','pha':'Phase'})
>>> nc.reads()
>>> nc.read_cons()
>>> df=nc._toDataFrame()
EXCEL FILE¶
Read an xls or xlsx file. This imports an Excel file. The function uses the read_excel function from pandas (https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_excel.html). This class returns a pandas DataFrame with some extra attributes such as Latitude, Longitude and Units.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
- sheet_name: str, int, list, or None, default 0
Strings are used for sheet names. Integers are used in zero-indexed sheet positions. Lists of strings/integers are used to request multiple sheets. Specify None to get all sheets.
Available cases:
  - Defaults to 0: 1st sheet as a DataFrame
  - 1: 2nd sheet as a DataFrame
  - "Sheet1": Load sheet with name "Sheet1"
  - [0, 1, "Sheet5"]: Load first, second and sheet named "Sheet5" as a dict of DataFrame
  - None: All sheets.
- colNames: List, default []
List of column names to use.
- unitNames: List, default []
List of units to use.
- miss_val: scalar, str, list-like, or dict, optional
Additional strings to recognize as NA/NaN. If a dict is passed, specific per-column NA values.
- colNamesLine: int, default 1
Line number where the headers are defined.
- skiprows: list-like, int or callable, optional
Line numbers to skip (0-indexed) or number of lines to skip (int) at the start of the file. If callable, the callable function will be evaluated against the row indices, returning True if the row should be skipped and False otherwise. An example of a valid callable argument would be lambda x: x in [0, 2].
- skipfooter: int, default 0
Number of lines at bottom of file to skip (unsupported with engine='c').
- unitNamesLine: int, default 1
Line number where the units are defined.
- single_column: bool, default False
Whether the time is represented in a single column.
- customUnit: str, default '%d-%m-%Y %H:%M:%S'
String representing the time format.
- unit: str, default 's'; can be 'auto', 'custom', 'matlab', 's' or 'D'
Unit of the single time column. Only matters if single_column is True.
- time_col_name: dict, default {'Year':'year','Month':'month','Day':'day','Hour':'hour','Min':'Minute','Sec':'Second'}
Dictionary for renaming each column so that pandas can interpret the time. Only matters if single_column is False.
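The two ways of describing time can be illustrated with the standard library. With single_column set to True and unit set to 'custom', customUnit is presumably a strptime-style format string (an assumption based on its default value); with single_column set to False, the date parts are spread over several columns that time_col_name maps to standard field names. The helper names below are hypothetical, and the sketch uses datetime directly rather than the pandas routines the reader relies on.

```python
from datetime import datetime


def parse_single_column(values, fmt="%d-%m-%Y %H:%M:%S"):
    """Parse time strings stored in one column with a custom format."""
    return [datetime.strptime(value, fmt) for value in values]


def parse_split_columns(rows, time_col_name):
    """Build datetimes from rows whose date parts sit in separate columns.

    time_col_name maps the column names found in the file to datetime
    field names (year, month, day, hour, minute, second).
    """
    times = []
    for row in rows:
        # Lower-case the target names so 'Minute' and 'Second' are accepted.
        parts = {time_col_name[key].lower(): int(row[key]) for key in time_col_name}
        times.append(datetime(**parts))
    return times


single = parse_single_column(["01-02-2020 03:04:05"])

rows = [{"Year": "2020", "Month": "2", "Day": "1", "Hour": "3", "Min": "4", "Sec": "5"}]
mapping = {"Year": "year", "Month": "month", "Day": "day",
           "Hour": "hour", "Min": "Minute", "Sec": "Second"}
split = parse_split_columns(rows, mapping)
```

Both calls describe the same instant, 1 February 2020 at 03:04:05, which shows that the choice between the two modes depends only on how the file stores its time information.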
Examples¶
>>> from toto.inputs.xls import XLSfile
>>> tx=XLSfile([filename],sheetnames='test3', colNames= [], unitNames= [],miss_val='NaN', colNamesLine= 1, skiprows= 2, unitNamesLine= 0, skipfooter= 0, single_column= True, unit= 's', customUnit= '%d-%m-%Y %H:%M:%S', time_col_name= {})
>>> tx.reads()
>>> tx.read_time()
>>> df=tx._toDataFrame()
RSK FILE¶
Read an RSK file from RBR Ltd. This imports raw files from an RBR pressure sensor. This class returns a pandas DataFrame.
Parameters¶
- filename: (files,) str or list_like
A list of filenames to process.
Examples¶
>>> from toto.inputs.rsk import RSKfile
>>> nc=RSKfile('filename.rsk')._toDataFrame()