pandas.io.parsers.read_csv

pandas.io.parsers.read_csv(filepath_or_buffer, sep=',', dialect=None, header=0, index_col=None, names=None, skiprows=None, na_values=None, thousands=None, comment=None, parse_dates=False, keep_date_col=False, dayfirst=False, date_parser=None, nrows=None, iterator=False, chunksize=None, skip_footer=0, converters=None, verbose=False, delimiter=None, encoding=None, squeeze=False)

Read a CSV (comma-separated) file into a DataFrame.

Also supports optionally iterating over the file or breaking it into chunks.

Parameters :


filepath_or_buffer : string or file handle / StringIO

The string could be a URL. Valid URL schemes include http, ftp, and file. For file URLs, a host is expected. For instance, a local file could be file://localhost/path/to/table.csv

sep : string, default ‘,’

Delimiter to use. If sep is None, will try to automatically determine this. Regular expressions are accepted.
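A minimal sketch of passing a non-default delimiter. It assumes Python 3 and a recent pandas in which the function is also reachable as `pd.read_csv`; the sample text is made up for illustration.

```python
from io import StringIO

import pandas as pd

# Hypothetical semicolon-delimited sample, held in memory instead of a file.
data = "a;b\n1;2\n3;4\n"

# sep tells the parser which character separates the fields.
df = pd.read_csv(StringIO(data), sep=";")
print(df)
```

The same call should work with a file path or URL in place of the StringIO buffer.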

dialect : string or csv.Dialect instance, default None

If None, defaults to the Excel dialect. Ignored if sep is longer than 1 character. See the csv.Dialect documentation for more details.

header : int, default 0

Row to use for the column labels of the parsed DataFrame

skiprows : list-like or integer

Row numbers to skip (0-indexed) or number of rows to skip (int)
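The list form of skiprows can be sketched as follows, combined with nrows to read a small slice. This assumes Python 3 and a recent pandas exposing `pd.read_csv`; the data is invented.

```python
from io import StringIO

import pandas as pd

# Line 0 is the header; lines 1-3 are data rows (hypothetical values).
data = "a,b\n1,2\n3,4\n5,6\n"

# Skip file line 1 (the first data row), then read a single row.
df = pd.read_csv(StringIO(data), skiprows=[1], nrows=1)
print(df)
```

Note that the row numbers count lines of the file, so the header line is number 0.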

index_col : int or sequence, default None

Column to use as the row labels of the DataFrame. If a sequence is given, a MultiIndex is used.
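As an illustration of the sequence form, the sketch below builds a two-level row index from the first two columns. It assumes Python 3 and a recent pandas exposing `pd.read_csv`; the column names and values are hypothetical.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: two key columns followed by one value column.
data = "year,month,sales\n2012,1,10\n2012,2,20\n"

# A sequence for index_col yields a MultiIndex built from those columns.
df = pd.read_csv(StringIO(data), index_col=[0, 1])
print(df.index.nlevels)
```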

names : array-like

List of column names

na_values : list-like or dict, default None

Additional strings to recognize as NA/NaN. If a dict is passed, it gives specific per-column NA values.
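A short sketch of the dict form, where the extra NA markers apply to one column only. It assumes Python 3 and a recent pandas exposing `pd.read_csv`; the markers "-" and "missing" are made-up examples.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with two ad hoc markers for missing scores.
data = "id,score\n1,-\n2,5\n3,missing\n"

# Keys are column names; values list the strings to treat as NA there.
df = pd.read_csv(StringIO(data), na_values={"score": ["-", "missing"]})
print(df["score"].isnull())
```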

parse_dates : boolean, list of ints or names, list of lists, or dict

True -> try parsing all columns
[1, 2, 3] -> try parsing columns 1, 2, 3 each as a separate date column
[[1, 3]] -> combine columns 1 and 3 and parse as a single date column
{'foo' : [1, 3]} -> parse columns 1, 3 as date and call result 'foo'
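The simplest list form can be sketched like this: a single column position is parsed as dates. The sketch assumes Python 3 and a recent pandas exposing `pd.read_csv`, and uses invented ISO-formatted dates; the nested-list and dict forms are not shown because their behavior may differ between pandas versions.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with ISO-formatted dates in column 0.
data = "date,value\n2012-01-02,1\n2012-03-04,2\n"

# parse_dates=[0] asks the parser to convert column 0 to datetimes.
df = pd.read_csv(StringIO(data), parse_dates=[0])
first = df["date"].iloc[0]
print(first.year, first.month, first.day)
```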

keep_date_col : boolean, default False

If True and parse_dates specifies combining multiple columns then keep the original columns.

date_parser : function

Function to use for converting strings to dates. Defaults to dateutil.parser

dayfirst : boolean, default False

DD/MM format dates, international and European format

thousands : str, default None

Thousands separator
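A minimal sketch of the thousands option. It assumes Python 3 and a recent pandas exposing `pd.read_csv`; the data uses a semicolon delimiter so that the comma is free to act as the thousands separator, and all values are hypothetical.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample: fields split on ';', numbers grouped with ','.
data = "city;population\nA;1,234\nB;56,789\n"

# thousands="," strips the grouping commas before numeric conversion.
df = pd.read_csv(StringIO(data), sep=";", thousands=",")
print(df["population"].tolist())
```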

comment : str, default None

Indicates that the remainder of the line should not be parsed. Does not support line commenting (will return an empty line).

nrows : int, default None

Number of rows of file to read. Useful for reading pieces of large files

iterator : boolean, default False

Return TextParser object

chunksize : int, default None

Return TextParser object for iteration
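Chunked reading can be sketched as below. It assumes Python 3 and a recent pandas exposing `pd.read_csv`; the exact class of the returned reader may differ between versions, so the sketch relies only on iterating over it. The data is invented.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample with five data rows.
data = "x\n1\n2\n3\n4\n5\n"

# With chunksize set, the call returns an iterable reader, not a DataFrame.
reader = pd.read_csv(StringIO(data), chunksize=2)

# Each iteration yields a DataFrame holding at most `chunksize` rows.
sizes = [len(chunk) for chunk in reader]
print(sizes)
```

Processing one chunk at a time keeps memory use bounded when the file is too large to load at once.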

skip_footer : int, default 0

Number of lines at the bottom of the file to skip

converters : dict, optional

Dict of functions for converting values in certain columns. Keys can either be integers or column labels
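A sketch of a per-column converter keyed by column label. It assumes Python 3 and a recent pandas exposing `pd.read_csv`; the helper `strip_dollar` and the sample data are hypothetical.

```python
from io import StringIO

import pandas as pd

# Hypothetical sample where prices carry a leading dollar sign.
data = "item,price\napple,$1.50\npear,$2.25\n"


def strip_dollar(text):
    """Hypothetical helper: drop a leading '$' and convert to float."""
    return float(text.lstrip("$"))


# The converter receives each raw string value from the "price" column.
df = pd.read_csv(StringIO(data), converters={"price": strip_dollar})
print(df["price"].tolist())
```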

verbose : boolean, default False

Indicate number of NA values placed in non-numeric columns

delimiter : string, default None

Alternative argument name for sep. Regular expressions are accepted.

encoding : string, default None

Encoding to use for UTF when reading/writing (e.g. 'utf-8')

squeeze : boolean, default False

If the parsed data only contains one column then return a Series

Returns :

result : DataFrame or TextParser

pandas 0.8.1 documentation
