python - Formatting csv to allow numpy to make a data frame


I'm trying to read a CSV file into numpy. I'm following this tutorial, but my data is formatted differently from the example.

Here's the CSV data.

And the code I'm using:

    import datetime as dt
    import pandas as pd
    import numpy as np

    na_data = np.loadtxt('btc.csv', delimiter=',', skiprows=2)
    na_price = na_data[:, 3:4]
    na_dates = np.str_(na_data[:, 0:1])

    print(na_price)
    print(na_dates)

    ValueError: invalid literal for float(): 09/08/2015

I need to format the date at the beginning of each row. I've been following other people's Q&As online and realize I need something like pd.read_csv('btc.csv', dayfirst=True, parse_dates=[0]), but I can't figure out how to implement it.

Thank you for your time.

Edit: the data was taken from here, and I wrote a script to split each line. Following jezrael's comment, printing the data frame produces a similar format! Maybe I can feed the text directly into pandas?

You can set the parameter sep of the function read_csv to arbitrary whitespace (\s+), and then select individual values with loc:

    import pandas as pd
    import io

    temp = u"""date        low     open    close   high    btc_vol
    08/08/2015  266     280.04  266.82  280.32  273.43
    09/08/2015  260.88  264     265.52  267.6   264.76
    10/08/2015  262.17  265.69  265.1   267.72  265.395
    """
    # after testing, replace io.StringIO(temp) with the filename
    df = pd.read_csv(io.StringIO(temp), sep=r"\s+", parse_dates=[0], dayfirst=True)
    print(df)
    #        date     low    open   close    high  btc_vol
    #0 2015-08-08  266.00  280.04  266.82  280.32  273.430
    #1 2015-08-09  260.88  264.00  265.52  267.60  264.760
    #2 2015-08-10  262.17  265.69  265.10  267.72  265.395

    print(df.loc[2, 'date'])
    #2015-08-10 00:00:00

    print(df.loc[2, 'close'])
    #265.1

If you want to convert the pandas DataFrame to a numpy array, use values:

    print(df.values)
    #[[Timestamp('2015-08-08 00:00:00') 266.0 280.04 266.82 280.32 273.43]
    # [Timestamp('2015-08-09 00:00:00') 260.88 264.0 265.52 267.6 264.76]
    # [Timestamp('2015-08-10 00:00:00') 262.17 265.69 265.1 267.72 265.395]]
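Because values mixes Timestamp objects and floats, the result is an object array. If you want something closer to the na_price and na_dates arrays from the question, one option is to convert the columns separately. This is only a sketch: the inline sample and its comma-separated layout are my assumption, not the real btc.csv, and to_numpy() needs a reasonably recent pandas.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical comma-separated sample standing in for btc.csv
sample = u"""date,low,open,close,high
08/08/2015,266,280.04,266.82,280.32
09/08/2015,260.88,264,265.52,267.6
10/08/2015,262.17,265.69,265.1,267.72
"""

df = pd.read_csv(io.StringIO(sample), parse_dates=[0], dayfirst=True)

# Double brackets keep a 2-D column, like na_data[:, 3:4] in the question
na_price = df[['close']].to_numpy()

# The parsed dates come back as a 1-D datetime64 array
na_dates = df['date'].to_numpy()

print(na_price.shape)
print(na_dates.dtype)
```

Keeping the numeric columns apart from the dates means the price array stays float64 instead of falling back to object dtype.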

Edit:

You can omit the separator, because sep=',' is the default value (thanks, Anton):

    import pandas as pd

    df = pd.read_csv('test/btc.csv', parse_dates=[0], dayfirst=True)
    print(df.head())
    #           d     low    open   close    high  Unnamed: 5       btc_vol  \
    #0 2015-08-08  266.00  280.04  266.82  280.32     273.430  29915.158940
    #1 2015-08-09  260.88  264.00  265.52  267.60     264.760  16578.024530
    #2 2015-08-10  262.17  265.69  265.10  267.72     265.395  10780.629240
    #3 2015-08-11  264.81  265.09  269.57  270.30     267.330   9817.758063
    #4 2015-08-12  265.80  269.30  269.84  273.75     269.570  14290.615450
    #
    #   usd_vol  Unnamed: 8  Unnamed: 9
    #0  8116830           0  281.312854
    #1  4382630           0  279.808773
    #2  2856790           0  278.407937
    #3  2619460           0  277.566229
    #4  3848950           0  276.830398
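The "Unnamed" columns in that output appear because some header cells in the file are empty, so pandas generates names for them. If you don't need those columns, one way to drop them is to filter on the generated name. This is a sketch under assumptions: the sample below is made up to reproduce an empty header cell and is not the real btc.csv.

```python
import io

import pandas as pd

# Hypothetical sample: the fourth header cell is empty,
# so pandas names that column "Unnamed: 3"
sample = u"""date,low,open,,btc_vol
08/08/2015,266,280.04,273.43,29915.15894
09/08/2015,260.88,264,264.76,16578.02453
"""

df = pd.read_csv(io.StringIO(sample), parse_dates=[0], dayfirst=True)

# Keep only the columns whose header was not auto-generated by pandas
named = df.loc[:, ~df.columns.str.startswith('Unnamed')]

print(named.columns.tolist())
```

Check first whether the unnamed columns hold data you want (in the output above, one of them looks like a price), in which case renaming them may be a better choice than dropping them.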
