Note: A blank line in a CSV file will be returned as an array comprising a single null field, and will not be treated as an error.
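The blank-line behaviour can be handled explicitly when collecting rows. The following is only a minimal sketch: the helper name read_csv_skip_blank() and the in-memory sample data are assumptions made up for illustration, and the delimiter, enclosure and escape arguments are spelled out only so the call does not rely on defaults.

```php
<?php
// Hypothetical helper: reads every row from an open handle, skipping blank lines.
function read_csv_skip_blank($handle): array
{
    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        // A blank line comes back as an array holding a single null field.
        if ($row === [null]) {
            continue;
        }
        $rows[] = $row;
    }
    return $rows;
}

// Sample data with one blank line in the middle (made up for this sketch).
$handle = fopen('php://memory', 'w+');
fwrite($handle, "a,b\n\nc,d\n");
rewind($handle);
$rows = read_csv_skip_blank($handle);
fclose($handle);
print_r($rows);
```

Comparing against [null] with === keeps rows that merely contain empty strings, which are real data.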
Note: If PHP is not properly recognizing the line endings when reading files either on or created by a Macintosh computer, enabling the auto_detect_line_endings run-time configuration option may help resolve the problem.
Example #1 Read and print the entire contents of a CSV file
<?php
$row = 1;
if (($handle = fopen("test.csv", "r")) !== FALSE) {
    while (($data = fgetcsv($handle, 1000, ",")) !== FALSE) {
        $num = count($data);
        echo "<p> $num fields in line $row: <br /></p>\n";
        $row++;
        for ($c = 0; $c < $num; $c++) {
            echo $data[$c] . "<br />\n";
        }
    }
    fclose($handle);
}
?>
str_getcsv() - Parse a CSV string into an array
explode() - Split a string by a string
file() - Reads entire file into an array
pack() - Pack data into binary string
fputcsv() - Format line as CSV and write to file pointer
james dot ellis at gmail dot com
¶
15 years ago
If you need to set auto_detect_line_endings to deal with Mac line endings, it may seem obvious but remember it should be set before fopen, not after:
This will work:
<?php
ini_set('auto_detect_line_endings', TRUE);
$handle = fopen('/path/to/file', 'r');
while (($data = fgetcsv($handle)) !== FALSE) {
}
ini_set('auto_detect_line_endings', FALSE);
?>
This won't, you will still get concatenated fields at the new line position:
<?php
$handle = fopen('/path/to/file', 'r');
ini_set('auto_detect_line_endings', TRUE);
while (($data = fgetcsv($handle)) !== FALSE) {
}
ini_set('auto_detect_line_endings', FALSE);
?>
shaun at slickdesign dot com dot au
¶
6 years ago
When a BOM character is supplied, `fgetcsv` may appear to wrap the first element in "double quotation marks". The simplest way to ignore it is to advance the file pointer to the 4th byte before using `fgetcsv`.
<?php
$bom = "\xef\xbb\xbf";
$fp = fopen($path, 'r');
if (fgets($fp, 4) !== $bom) {
    rewind($fp);
}
$lines = array();
while (!feof($fp) && ($line = fgetcsv($fp)) !== false) {
    $lines[] = $line;
}
?>
michael dot arnauts at gmail dot com
¶
12 years ago
fgetcsv seems to handle newlines within fields fine. So in fact it is not reading a line, but keeps reading until it finds a \n-character that's not quoted as a field.
Example:
<?php
$handle = fopen("test.csv", "r");
while (($data = fgetcsv($handle)) !== FALSE) {
    var_dump($data);
}
?>
Returns:
array(3) {
[0]=>
string(5) "col 1"
[1]=>
string(4) "col2"
[2]=>
string(4) "col3"
}
array(3) {
[0]=>
string(29) "this
is
having
multiple
lines"
[1]=>
string(8) "this not"
[2]=>
string(13) "this also not"
}
array(3) {
[0]=>
string(13) "normal record"
[1]=>
string(19) "nothing to see here"
[2]=>
string(7) "no data"
}
This means that you can expect fgetcsv to handle newlines within fields fine. This was not clear from the documentation.
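To reproduce this without a file on disk, here is a small sketch. The sample data and the use of a php://memory stream are assumptions for illustration only; the point is that a line break inside a quoted field stays inside that field.

```php
<?php
// Sample data (made up): the second record has a line break inside a quoted field.
$handle = fopen('php://memory', 'w+');
fwrite($handle, "id,comment\n1,\"first line\nsecond line\"\n2,plain\n");
rewind($handle);

$records = [];
while (($data = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
    $records[] = $data;
}
fclose($handle);

// Four physical lines were written, but only three records are read.
echo count($records), " records\n";
var_dump($records[1][1]);
```

Note that the line break only survives because the field is enclosed in quotes; an unquoted line break still ends the record.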
Sbastien
¶
3 years ago
To use fgetcsv() with a string instead of a file, you can use the data: wrapper (https://www.php.net/wrappers.data):
<?php
$csv = <<<CSV
v1.1,v1.2,v1.3
v2.1,v2.2,v2.3
CSV;
$fp = fopen('data://text/plain,' . $csv, 'r');
print_r(fgetcsv($fp));
print_r(fgetcsv($fp));
?>
myrddin at myrddin dot myrddin
¶
17 years ago
Here is an OOP-based importer similar to the one posted earlier. However, it is slightly more flexible in that you can import huge files without running out of memory; you just have to pass a limit to the get() method.
Sample usage for small files:-
-------------------------------------
<?php
$importer
= new
CsvImporter
(
"small.txt"
,
true
);
$data
=
$importer
->
get
();
print_r
(
$data
);
?>
Sample usage for large files:-
-------------------------------------
<?php
$importer
= new
CsvImporter
(
"large.txt"
,
true
);
while(
$data
=
$importer
->
get
(
2000
))
{
print_r
(
$data
);
}
?>
And here's the class:-
-------------------------------------
<?php
class
CsvImporter
{
private
$fp
;
private
$parse_header
;
private
$header
;
private
$delimiter
;
private
$length
;
function
__construct
(
$file_name
,
$parse_header
=
false
,
$delimiter
=
"\t"
,
$length
=
8000
)
{
$this
->
fp
=
fopen
(
$file_name
,
"r"
);
$this
->
parse_header
=
$parse_header
;
$this
->
delimiter
=
$delimiter
;
$this
->
length
=
$length
;
if (
$this
->
parse_header
)
{
$this
->
header
=
fgetcsv
(
$this
->
fp
,
$this
->
length
,
$this
->
delimiter
);
}
}
function
__destruct
()
{
if (
$this
->
fp
)
{
fclose
(
$this
->
fp
);
}
}
function
get
(
$max_lines
=
0
)
{
$data
= array();
if (
$max_lines
>
0
)
$line_count
=
0
;
else
$line_count
= -
1
;
while (
$line_count
<
$max_lines
&& (
$row
=
fgetcsv
(
$this
->
fp
,
$this
->
length
,
$this
->
delimiter
)) !==
FALSE
)
{
if (
$this
->
parse_header
)
{
foreach (
$this
->
header
as
$i
=>
$heading_i
)
{
$row_new
[
$heading_i
] =
$row
[
$i
];
}
$data
[] =
$row_new
;
}
else
{
$data
[] =
$row
;
}
if (
$max_lines
>
0
)
$line_count
++;
}
return
$data
;
}
}
?>
Tim Henderson
¶
16 years ago
The only problem with fgetcsv(), at least in PHP 4.x: any stray backslash in the data that happens to come before a double-quote delimiter will break it, i.e. cause the field delimiter to be escaped. I can't find a direct way to deal with it, since fgetcsv() doesn't give you a chance to manipulate the line before it reads and parses it... I've had to change all occurrences of '\"' to '"' in the file first before feeding it to fgetcsv(). Otherwise this is perfect for the Microsoft CSV format and deals gracefully with all the issues.
jc at goetc dot net
¶
19 years ago
I've had a lot of projects recently dealing with csv files, so I created the following class to read a csv file and return an array of arrays with the column names as keys. The only requirement is that the 1st row contain the column headings.
I only wrote it today, so I'll probably expand on it in the near future.
<?php
class
CSVparse
{
var
$mappings
= array();
function
parse_file
(
$filename
)
{
$id
=
fopen
(
$filename
,
"r"
);
$data
=
fgetcsv
(
$id
,
filesize
(
$filename
));
if(!
$this
->
mappings
)
$this
->
mappings
=
$data
;
while(
$data
=
fgetcsv
(
$id
,
filesize
(
$filename
)))
{
if(
$data
[
0
])
{
foreach(
$data
as
$key
=>
$value
)
$converted_data
[
$this
->
mappings
[
$key
]] =
addslashes
(
$value
);
$table
[] =
$converted_data
;
}
}
fclose
(
$id
);
return
$table
;
}
}
?>
michael dot martinek at gmail dot com
¶
15 years ago
Here's something I put together this morning. It allows you to read rows from your CSV and get values based on the name of the column. This works great when your header columns are not always in the same order; like when you're processing many feeds from different customers. Also makes for cleaner, easier to manage code.
So if your feed looks like this:
product_id,category_name,price,brand_name, sku_isbn_upc,image_url,landing_url,title,description
123,Test Category,12.50,No Brand,0,
http://www.example.com,
http://www.example.com/landing.php,
Some Title,Some Description
You can do:
<?php
while (
$o
->
getNext
())
{
$dPrice
=
$o
->
getPrice
();
$nProductID
=
$o
->
getProductID
();
$sBrandName
=
$o
->
getBrandName
();
}
?>
If you have any questions or comments regarding this class, they can be directed to [email protected] as I probably won't be checking back here.
<?php
define
(
'C_PPCSV_HEADER_RAW'
,
0
);
define
(
'C_PPCSV_HEADER_NICE'
,
1
);
class
PaperPear_CSVParser
{
private
$m_saHeader
= array();
private
$m_sFileName
=
''
;
private
$m_fp
=
false
;
private
$m_naHeaderMap
= array();
private
$m_saValues
= array();
function
__construct
(
$sFileName
)
{
if (
$this
->
m_fp
=
fopen
(
$sFileName
,
'r'
))
{
$this
->
processHeader
();
}
}
function
__call
(
$sMethodName
,
$saArgs
)
{
if (
preg_match
(
"/[sg]et(.*)/"
,
$sMethodName
,
$saFound
))
{
$sName
=
strtoupper
(
$saFound
[
1
]);
if (
array_key_exists
(
$sName
,
$this
->
m_naHeaderMap
))
{
$nIndex
=
$this
->
m_naHeaderMap
[
$sName
];
if (
$sMethodName
[
0
] ==
'g'
)
{
return
$this
->
m_saValues
[
$nIndex
];
}
else
{
$this
->
m_saValues
[
$nIndex
] =
$saArgs
[
0
];
return
true
;
}
}
}
return
false
;
}
public static function
GetNiceHeaderName
(
$sName
)
{
return
strtoupper
(
preg_replace
(
'/[^A-Za-z0-9]/'
,
''
,
$sName
));
}
private function
processHeader
()
{
$sLine
=
fgets
(
$this
->
m_fp
);
$saFields
=
explode
(
","
,
$sLine
);
$nIndex
=
0
;
foreach (
$saFields
as
$sField
)
{
$sField
=
trim
(
$sField
);
$sNiceName
=
PaperPear_CSVParser
::
GetNiceHeaderName
(
$sField
);
$this
->
m_saHeader
[
$nIndex
] = array(
C_PPCSV_HEADER_RAW
=>
$sField
,
C_PPCSV_HEADER_NICE
=>
$sNiceName
);
$this
->
m_naHeaderMap
[
$sNiceName
] =
$nIndex
;
$nIndex
++;
}
}
public function
getNext
()
{
if ((
$saValues
=
fgetcsv
(
$this
->
m_fp
)) !==
false
)
{
$this
->
m_saValues
=
$saValues
;
return
true
;
}
return
false
;
}
}
$o
= new
PaperPear_CSVParser
(
'F:\foo.csv'
);
while (
$o
->
getNext
())
{
echo
"Price="
.
$o
->
getPrice
() .
"\r\n"
;
}
?>
tomasz at marcinkowski dot pl
¶
10 years ago
For anyone else struggling with disappearing non-latin characters in one-byte encodings - setting LANG env var (as the manual states) does not help at all. Look at LC_ALL instead.
In my case it was set to "pl_PL.utf8" but since my input file was in CP1250 most of the Polish characters (but not all of them!) had gone missing and the city of "Łódź" had become just "dź". I've "fixed" it with "pl_PL".
kent at marketruler dot com
¶
14 years ago
Note that fgetcsv, at least in PHP 5.3 or earlier, will NOT work with UTF-16 encoded files. Your options are to convert the entire file to ISO-8859-1 (latin1), or to convert line by line into ISO-8859-1 and then use str_getcsv (or a backwards-compatible implementation). If you need to read non-Latin alphabets, it is probably best to convert to UTF-8.
See str_getcsv for a backwards-compatible version of it for PHP < 5.3, and see utf8_decode for a function written by Rasmus Andersson which provides utf16_decode. The modification I added handles the fact that the BOM appears at the top of the file but not on subsequent lines, so you need to store the endianness and pass it back in when decoding each subsequent line. This modified version returns the endianness through its second argument if it was not supplied:
<?php
function utf16_decode($str, &$be = null) {
    if (strlen($str) < 2) {
        return $str;
    }
    $c0 = ord($str[0]);
    $c1 = ord($str[1]);
    $start = 0;
    if ($c0 == 0xFE && $c1 == 0xFF) {
        // Big-endian BOM
        $be = true;
        $start = 2;
    } else if ($c0 == 0xFF && $c1 == 0xFE) {
        // Little-endian BOM
        $start = 2;
        $be = false;
    }
    if ($be === null) {
        // No BOM and no endianness passed in: assume big-endian
        $be = true;
    }
    $len = strlen($str);
    $newstr = '';
    for ($i = $start; $i + 1 < $len; $i += 2) {
        // Combine the two bytes of each code unit (high byte shifted by 8 bits)
        if ($be) {
            $val = ord($str[$i]) << 8;
            $val += ord($str[$i + 1]);
        } else {
            $val = ord($str[$i + 1]) << 8;
            $val += ord($str[$i]);
        }
        // U+2028 (line separator) becomes "\n"; chr() keeps only the low byte,
        // so code points above 0xFF are not preserved
        $newstr .= ($val == 0x2028) ? "\n" : chr($val);
    }
    return $newstr;
}
?>
Trying the "setlocale" trick did not work for me, e.g.
<?php
setlocale
(
LC_CTYPE
,
"en.UTF16"
);
$line
=
fgetcsv
(
$file
, ...)
?>
But that's perhaps because my platform didn't support it. However, fgetcsv only supports single characters for the delimiter, etc. and complains if you pass in a UTF-16 version of said character, so I gave up on that rather quickly.
Hope this is helpful to someone out there.
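On current PHP versions, another way to approach UTF-16 input is to convert the whole document to UTF-8 first and then parse it from memory. This is only a sketch under assumptions: the helper name parse_utf16le_csv() is made up, the input is assumed to be little-endian UTF-16 that fits in memory, and the iconv extension is assumed to be available.

```php
<?php
// Hypothetical helper: convert a whole UTF-16LE document to UTF-8, then parse it.
function parse_utf16le_csv(string $raw, string $delimiter = ','): array
{
    // Drop a leading UTF-16LE byte order mark (FF FE) if present.
    if (strncmp($raw, "\xFF\xFE", 2) === 0) {
        $raw = substr($raw, 2);
    }
    $utf8 = iconv('UTF-16LE', 'UTF-8', $raw);
    if ($utf8 === false) {
        return [];
    }

    // Parse the converted text from an in-memory stream.
    $handle = fopen('php://memory', 'w+');
    fwrite($handle, $utf8);
    rewind($handle);
    $rows = [];
    while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

// Sample input (made up): a BOM followed by UTF-16LE encoded CSV text.
$sample = "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', "name,city\nAnna,Krak\u{00F3}w\n");
$rows = parse_utf16le_csv($sample);
print_r($rows);
```

For files too large to hold in memory, converting line by line as described above is still the safer route.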
junk at vhd dot com dot au
¶
18 years ago
The fgetcsv function seems to follow the MS excel conventions, which means:
- The quoting character is escaped by itself and not the back slash.
(i.e. let's use the double quote (") as the quoting character:
Two double quotes "" will give a single " once parsed, if they are inside a quoted field (otherwise neither of them will be removed).
\" will give \" whether it is in a quoted field or not (same for \\) , and
if a single double quote is inside a quoted field it will be removed. If it is not inside a quoted field it will stay).
- leading and trailing spaces (\s or \t) are never removed, regardless of whether they are in quoted fields or not.
- Line breaks within fields are dealt with correctly if they are in quoted fields. (So previous comments stating the opposite are wrong, unless they are using a different PHP version.... I am using 4.4.0.)
So fgetcsv is actually very complete and can deal with every possible situation. (It does need help for Macintosh line breaks though, as mentioned in the help files.)
I wish I knew all this from the start. From my own benchmarks fgetcsv strikes a very good compromise between memory consumption and speed.
-------------------------
Note: If back slashes are used to escape quotes they can easily be removed afterwards. Same for leading and trailing spaces.
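The doubling rule described above can be checked with a short sketch. The php://memory stream and the sample line are assumptions made up for illustration, and the escape character is passed explicitly only so the call does not rely on defaults.

```php
<?php
// Sample line (made up): "say ""hi""",plain
$handle = fopen('php://memory', 'w+');
fwrite($handle, "\"say \"\"hi\"\"\",plain\n");
rewind($handle);

// Inside a quoted field, two enclosure characters collapse into one.
$fields = fgetcsv($handle, 0, ',', '"', '\\');
fclose($handle);

var_dump($fields);
```

The first field comes back with single double quotes around hi, and the second field is untouched.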
code at ashleyhunt dot co dot uk
¶
13 years ago
I needed a function to analyse a file for delimiters and line endings prior to importing the file into MySQL using LOAD DATA LOCAL INFILE
I wrote this function to do the job, the results are (mostly) very accurate and it works nicely with large files too.
<?php
function
analyse_file
(
$file
,
$capture_limit_in_kb
=
10
) {
$output
[
'peak_mem'
][
'start'
] =
memory_get_peak_usage
(
true
);
$output
[
'read_kb'
] =
$capture_limit_in_kb
;
$fh
=
fopen
(
$file
,
'r'
);
$contents
=
fread
(
$fh
, (
$capture_limit_in_kb
*
1024
));
fclose
(
$fh
);
$delimiters
= array(
'comma'
=>
','
,
'semicolon'
=>
';'
,
'tab'
=>
"\t"
,
'pipe'
=>
'|'
,
'colon'
=>
':'
);
$line_endings
= array(
'rn'
=>
"\r\n"
,
'n'
=>
"\n"
,
'r'
=>
"\r"
,
'nr'
=>
"\n\r"
);
foreach (
$line_endings
as
$key
=>
$value
) {
$line_result
[
$key
] =
substr_count
(
$contents
,
$value
);
}
asort
(
$line_result
);
$output
[
'line_ending'
][
'results'
] =
$line_result
;
$output
[
'line_ending'
][
'count'
] =
end
(
$line_result
);
$output
[
'line_ending'
][
'key'
] =
key
(
$line_result
);
$output
[
'line_ending'
][
'value'
] =
$line_endings
[
$output
[
'line_ending'
][
'key'
]];
$lines
=
explode
(
$output
[
'line_ending'
][
'value'
],
$contents
);
array_pop
(
$lines
);
$complete_lines
=
implode
(
' '
,
$lines
);
$output
[
'lines'
][
'count'
] =
count
(
$lines
);
$output
[
'lines'
][
'length'
] =
strlen
(
$complete_lines
);
foreach (
$delimiters
as
$delimiter_key
=>
$delimiter
) {
$delimiter_result
[
$delimiter_key
] =
substr_count
(
$complete_lines
,
$delimiter
);
}
asort
(
$delimiter_result
);
$output
[
'delimiter'
][
'results'
] =
$delimiter_result
;
$output
[
'delimiter'
][
'count'
] =
end
(
$delimiter_result
);
$output
[
'delimiter'
][
'key'
] =
key
(
$delimiter_result
);
$output
[
'delimiter'
][
'value'
] =
$delimiters
[
$output
[
'delimiter'
][
'key'
]];
$output
[
'peak_mem'
][
'end'
] =
memory_get_peak_usage
(
true
);
return
$output
;
}
?>
Example Usage:
<?php
$Array
=
analyse_file
(
'/www/files/file.csv'
,
10
);
?>
Full function output:
Array
(
[peak_mem] => Array
(
[start] => 786432
[end] => 786432
)
[line_ending] => Array
(
[results] => Array
(
[nr] => 0
[r] => 4
[n] => 4
[rn] => 4
)
[count] => 4
[key] => rn
[value] =>
)
[lines] => Array
(
[count] => 4
[length] => 94
)
[delimiter] => Array
(
[results] => Array
(
[colon] => 0
[semicolon] => 0
[pipe] => 0
[tab] => 1
[comma] => 17
)
[count] => 17
[key] => comma
[value] => ,
)
[read_kb] => 10
)
Enjoy!
Ashley
jonathangrice at yahoo dot com
¶
13 years ago
This is how to read a csv file into a multidimensional array.
<?php
if ((
$handle
=
fopen
(
"file.csv"
,
"r"
)) !==
FALSE
) {
$nn
=
0
;
while ((
$data
=
fgetcsv
(
$handle
,
1000
,
","
)) !==
FALSE
) {
$c
=
count
(
$data
);
for (
$x
=
0
;
$x
<
$c
;
$x
++)
{
$csvarray
[
$nn
][
$x
] =
$data
[
$x
];
}
$nn
++;
}
fclose
(
$handle
);
}
print_r
(
$csvarray
);
?>
phpnet at smallfryhosting dot co dot uk
¶
20 years ago
Another version [modified michael from mediaconcepts]
<?php
function
arrayFromCSV
(
$file
,
$hasFieldNames
=
false
,
$delimiter
=
','
,
$enclosure
=
''
) {
$result
= Array();
$size
=
filesize
(
$file
) +
1
;
$file
=
fopen
(
$file
,
'r'
);
if (
$hasFieldNames
)
$keys
=
fgetcsv
(
$file
,
$size
,
$delimiter
,
$enclosure
);
while (
$row
=
fgetcsv
(
$file
,
$size
,
$delimiter
,
$enclosure
)) {
$n
=
count
(
$row
);
$res
=array();
for(
$i
=
0
;
$i
<
$n
;
$i
++) {
$idx
= (
$hasFieldNames
) ?
$keys
[
$i
] :
$i
;
$res
[
$idx
] =
$row
[
$i
];
}
$result
[] =
$res
;
}
fclose
(
$file
);
return
$result
;
}
?>
matthias dot isler at gmail dot com
¶
14 years ago
If you want to load some translations for your application, don't use csv files for that, even if it's easier to handle.
The following code snippet:
<?php
$lang
= array();
$handle
=
fopen
(
'en.csv'
,
'r'
);
while(
$row
=
fgetcsv
(
$handle
,
500
,
';'
))
{
$lang
[
$row
[
0
]] =
$row
[
1
];
}
fclose
(
$handle
);
?>
is about 400% slower than this code:
<?php
$lang
= array();
$values
=
parse_ini_file
(
'de.ini'
);
foreach(
$values
as
$key
=>
$val
)
{
$lang
[
$key
] =
$val
;
}
?>
That's the reason why you should always use .ini files for translations...
http://php.net/parse_ini_file
matasbi at gmail dot com
¶
13 years ago
Parse from Microsoft Excel "Unicode Text (*.txt)" format:
<?php
function
parse
(
$file
) {
if ((
$handle
=
fopen
(
$file
,
"r"
)) ===
FALSE
) return;
while ((
$cols
=
fgetcsv
(
$handle
,
1000
,
"\t"
)) !==
FALSE
) {
foreach(
$cols
as
$key
=>
$val
) {
$cols
[
$key
] =
trim
(
$cols
[
$key
] );
$cols
[
$key
] =
iconv
(
'UCS-2'
,
'UTF-8'
,
$cols
[
$key
].
"\0"
) ;
$cols
[
$key
] =
str_replace
(
'""'
,
'"'
,
$cols
[
$key
]);
$cols
[
$key
] =
preg_replace
(
"/^\"(.*)\"$/sim"
,
"$1"
,
$cols
[
$key
]);
}
echo
print_r
(
$cols
,
1
);
}
}
?>
daniel at softel dot jp
¶
18 years ago
Note that fgetcsv() uses the system locale setting to make assumptions about character encoding.
So if you are trying to process a UTF-8 CSV file on an EUC-JP server (for example),
you will need to do something like this before you call fgetcsv():
setlocale(LC_ALL, 'ja_JP.UTF8');
[Also note that setlocale() doesn't *permanently* affect the system locale setting]
from_php at puggan dot se
¶
7 years ago
Setting the $escape parameter doesn't return unescaped strings; it just avoids splitting on a $delimiter that has an escape character in front of it:
<?php
$tmp_file
=
"/tmp/test.csv"
;
file_put_contents
(
$tmp_file
,
"\"first\\\";\\\"secound\""
);
echo
"raw:"
.
PHP_EOL
.
file_get_contents
(
$tmp_file
) .
PHP_EOL
.
PHP_EOL
;
echo
"fgetcsv escaped bs:"
.
PHP_EOL
;
$f
=
fopen
(
$tmp_file
,
'r'
);
while(
$r
=
fgetcsv
(
$f
,
1024
,
';'
,
'"'
,
"\\"
))
{
print_r
(
$r
);
}
fclose
(
$f
);
echo
PHP_EOL
;
echo
"fgetcsv escaped #:"
.
PHP_EOL
;
$f
=
fopen
(
$tmp_file
,
'r'
);
while(
$r
=
fgetcsv
(
$f
,
1024
,
';'
,
'"'
,
"#"
))
{
print_r
(
$r
);
}
fclose
(
$f
);
echo
PHP_EOL
;
?>
ifedinachukwu at yahoo dot com
¶
13 years ago
I had a csv file whose fields included data with line endings (CRLF created by hitting the carriage returns in html textarea). Of course, the LF in these fields was escaped by MySQL during the creation of the csv. Problem is I could NOT get fgetcsv to work correctly here, since each and every LF was regarded as the end of a line of the csv file, even when it was escaped!
Since what I wanted was to get THE FIRST LINE of the csv file, then count the number of fields by exploding on all unescaped commas, I had to resort to this:
<?php
$fp
=
fopen
(
'file.csv'
,
'r'
);
$i
=
1
;
$str
=
''
;
$srch
=
''
;
while (
false
!== (
$char
=
fgetc
(
$fp
))) {
$str
.=
$char
;
$srch
.=
$char
;
if(
strlen
(
$srch
) >
2
){
$srch
=
substr
(
$srch
,
1
);
}
if(
$i
>
1
&&
$srch
[
1
] ==
chr
(
10
) &&
$srch
[
0
] !=
'\\'
){
break;
}
$i
++;
}
echo
$str
;
?>
Perhaps there exists a more elegant solution to this issue, in which case I'd be glad to know!
jaimthorn at yahoo dot com
¶
14 years ago
I used fgetcsv to read pipe-delimited data files, and ran into the following quirk.
The data file contained data similar to this:
RECNUM|TEXT|COMMENT
1|hi!|some comment
2|"error!|another comment
3|where does this go?|yet another comment
4|the end!"|last comment
I read the file like this:
<?php
$row
=
fgetcsv
(
$fi
,
$length
,
'|'
);
?>
This causes a problem on record 2: the quote immediately after the pipe causes the file to be read up to the following quote --in this case, in record 4. Everything in between was stored in a single element of $row.
In this particular case it is easy to spot, but my script was processing thousands of records and it took me some time to figure out what went wrong.
The annoying thing is, that there doesn't seem to be an elegant fix. You can't tell PHP not to use an enclosure --for example, like this:
<?php
$row
=
fgetcsv
(
$fi
,
$length
,
'|'
,
''
);
?>
(Well, you can tell PHP that, but it doesn't work.)
So you'd have to resort to a solution where you use an extremely unlikely enclosure, but since the enclosure can only be one character long, it may be hard to find.
Alternatively (and IMNSHO: more elegantly), you can choose to read these files like this, instead:
<?php
$line
=
fgets
(
$fi
,
$length
);
$row
=
explode
(
'|'
,
$line
);
?>
As it's more intuitive and resilient, I've decided to favor this 'construct' over fgetcsv from now on.
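One detail worth handling with the fgets()/explode() construct is the line terminator, which otherwise stays attached to the last field. A minimal sketch, assuming a made-up helper name read_plain_row() and sample data held in a php://memory stream:

```php
<?php
// Hypothetical helper: split one delimited line with no enclosure handling at all.
function read_plain_row($handle, string $delimiter = '|')
{
    $line = fgets($handle);
    if ($line === false) {
        return false;
    }
    // explode() would keep the line terminator on the last field, so trim it first.
    return explode($delimiter, rtrim($line, "\r\n"));
}

// Sample data (made up), including the stray quote from record 2 above.
$fi = fopen('php://memory', 'w+');
fwrite($fi, "2|\"error!|another comment\n3|where does this go?|yet another comment\n");
rewind($fi);

$rows = [];
while (($row = read_plain_row($fi)) !== false) {
    $rows[] = $row;
}
fclose($fi);
print_r($rows);
```

The stray quote stays in the data as an ordinary character, and each physical line remains one record. The trade-off is that fields can no longer contain the delimiter or a line break.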
mortanon at gmail dot com
¶
18 years ago
Here is an example for a CSV Iterator.
<?php
class
CsvIterator
implements
Iterator
{
const
ROW_SIZE
=
4096
;
private
$filePointer
=
null
;
private
$currentElement
=
null
;
private
$rowCounter
=
null
;
private
$delimiter
=
null
;
public function
__construct
(
$file
,
$delimiter
=
','
)
{
try {
$this
->
filePointer
=
fopen
(
$file
,
'r'
);
$this
->
delimiter
=
$delimiter
;
}
catch (
Exception $e
) {
throw new
Exception
(
'The file "'
.
$file
.
'" cannot be read.'
);
}
}
public function
rewind
() {
$this
->
rowCounter
=
0
;
rewind
(
$this
->
filePointer
);
}
public function
current
() {
$this
->
currentElement
=
fgetcsv
(
$this
->
filePointer
,
self
::
ROW_SIZE
,
$this
->
delimiter
);
$this
->
rowCounter
++;
return
$this
->
currentElement
;
}
public function
key
() {
return
$this
->
rowCounter
;
}
public function
next
() {
return !
feof
(
$this
->
filePointer
);
}
public function
valid
() {
if (!
$this
->
next
()) {
fclose
(
$this
->
filePointer
);
return
false
;
}
return
true
;
}
}
?>
Usage :
<?php
$csvIterator
= new
CsvIterator
(
'/path/to/csvfile.csv'
);
foreach (
$csvIterator
as
$row
=>
$data
) {
}
?>
mustafa dot kachwala at gmail dot com
¶
13 years ago
A simple function to return 2 Dimensional array by parsing a CSV file.
<?php
function
get2DArrayFromCsv
(
$file
,
$delimiter
) {
if ((
$handle
=
fopen
(
$file
,
"r"
)) !==
FALSE
) {
$i
=
0
;
while ((
$lineArray
=
fgetcsv
(
$handle
,
4000
,
$delimiter
)) !==
FALSE
) {
for (
$j
=
0
;
$j
<
count
(
$lineArray
);
$j
++) {
$data2DArray
[
$i
][
$j
] =
$lineArray
[
$j
];
}
$i
++;
}
fclose
(
$handle
);
}
return
$data2DArray
;
}
?>
jack dot peterson at gmail dot com
¶
13 years ago
If you receive data in the following format:
Time,Dataset1,Dataset2,
timestamp1,item 1 for dataset 1,item1 for dataset2
timestamp2,item 2 for dataset 1,item2 for dataset2
the following code will output a series of arrays grouped by column with the resulting format:
array (
[column 1 title] => array (
[timestamp1] => item1 for dataset1
[timestamp2] => item2 for dataset1
)
[column 2 title] => array (
[timestamp1] => item1 for dataset2
[timestamp2] => item2 for dataset2
)
)
<?php
if ((
$handle
=
fopen
(
"rawdata.csv"
,
"r"
)) !==
FALSE
) {
$nn
=
0
;
while ((
$data
=
fgetcsv
(
$handle
,
0
,
","
)) !==
FALSE
) {
$c
=
count
(
$data
);
for (
$x
=
0
;
$x
<
$c
;
$x
++)
{
$csvarray
[
$nn
][
$x
] =
$data
[
$x
];
}
$nn
++;
}
fclose
(
$handle
);
}
function
columnizeArray
(
$csvarray
) {
$array
= array();
foreach(
$csvarray
as
$key
=>
$value
) {
if (
$key
==
0
) {
foreach (
$value
AS
$key2
=>
$value2
) {
$array
[
$key2
] = array();
$array
[
$key2
][] =
$value2
;
}
}else if (
$key
>
0
){
foreach (
$value
as
$key3
=>
$value3
) {
$array
[
$key3
][] =
$value3
;
}
}else{
}
}
return
$array
;
}
function
groupColumns
(
$array
=
null
) {
$array2
= array();
foreach (
$array
as
$k
=>
$v
) {
if (
$k
==
0
) {}else{
$array2
[
$v
[
0
]] = array();
foreach (
$array
[
0
] as
$k1
=>
$v1
) {
if (
$v1
>
0
) {
$array2
[
$v
[
0
]][
$v1
] =
$v
[
$k1
];
}
}
}
}
return
$array2
;
}
$array2
=
groupColumns
(
columnizeArray
(
$csvarray
));
print_r
(
$array2
);
?>
fil dot dogaru at gmail dot com
¶
1 year ago
Shorter solution to the handling proposed by jack dot peterson at gmail dot com. Not sure if more efficient, but I guess nowadays you all have at least 1GB RAM :)). Let me know @email.
He wrote: „If you receive data in the following format:
Time,Dataset1,Dataset2,
timestamp1,item 1 for dataset 1,item1 for dataset2
timestamp2,item 2 for dataset 1,item2 for dataset2
the following code will output a series of arrays grouped by column with the resulting format:
array (
[column 1 title] => array (
[timestamp1] => item1 for dataset1
[timestamp2] => item2 for dataset1
)
[column 2 title] => array (
[timestamp1] => item1 for dataset2
[timestamp2] => item2 for dataset2
)
)”
$filename = "mybeautifulcsv.csv";
$collected = array_map('str_getcsv', file($filename));
$total = count($collected[0]);
for($i=0; $i<$total; $i++):
$formated[$collected[0][$i]] = array_column($collected, $i, 0);
endfor;
array_shift($formated);
//var_dump($formated);
kamil dot dratwa at gmail dot com
¶
2 years ago
This part of the length parameter behavior description is tricky, because it's not mentioning that separator is considered as a char and converted into an empty string: "Otherwise the line is split in chunks of length characters (...)".
First, take a look at the example of reading a line which doesn't contain separators:
<?php
file_put_contents
(
'data.csv'
,
'foo'
);
$handle
=
fopen
(
'data.csv'
,
'c+'
);
$data
=
fgetcsv
(
$handle
,
2
);
var_dump
(
$data
);
?>
Example above will output:
array(1) {
[0]=>
string(2) "fo"
}
Now let's add separators:
<?php
file_put_contents
(
'data.csv'
,
'f,o,o'
);
$handle
=
fopen
(
'data.csv'
,
'c+'
);
$data
=
fgetcsv
(
$handle
,
2
);
var_dump
(
$data
);
?>
Second example will output:
array(2) {
[0]=>
string(1) "f"
[1]=>
string(0) ""
}
Now let's alter the length:
<?php
file_put_contents
(
'data.csv'
,
'f,o,o'
);
$handle
=
fopen
(
'data.csv'
,
'c+'
);
$data
=
fgetcsv
(
$handle
,
3
);
var_dump
(
$data
);
?>
Output of the last example is:
array(2) {
[0]=>
string(1) "f"
[1]=>
string(1) "o"
}
The final conclusion is that while splitting line in chunks, separator is considered as a char during the read but then it's being converted into empty string. What's more, if separator is at the very first or last position of a chunk it will be included in the result array, but if it's somewhere between other chars, then it will be just ignored.
lewiscowles at me dot com
¶
4 years ago
In-case anyone is having difficulty working around Byte-order-marks, the following should work. As usual no warranty, you should test your code... It's for UTF-8 only
<?php
$fh
=
fopen
(
'wut.csv'
,
'r'
);
$firstThreeBytes
=
fread
(
$fh
,
3
);
if(
$firstThreeBytes
!==
"\xef\xbb\xbf"
) {
rewind
(
$fh
);
}
while((
$row
=
fgetcsv
(
$fh
,
10000
,
','
)) !==
false
) {
}
?>
This basically reads 3 bytes and checks if they match https:
Daniel Klein
¶
7 years ago
The $escape parameter is completely unintuitive, but it is not broken. Here is a breakdown of fgetcsv()'s behaviour. In the examples I've used underscores (_) to show spaces and brackets ([]) to show individual fields:
- Leading whitespace in each field will be stripped if it comes immediately before an enclosure: ___"foo" -> [foo]
- There can only be one enclosure per field, although it will be concatenated with any data that appears between the end enclosure and the next delimiter/new line, including any trailing whitespaces ___"foo"_"bar"__ -> [foo_"bar"__]
- If the field does not start with (leading whitespace +) an enclosure, the whole field is interpreted as raw data, even if enclosure characters appear elsewhere within the field: _foo"bar"_ -> [_foo"bar"_]
- Delimiters cannot be escaped outside enclosures, they have to be enclosed instead. Delimiters don't need to be escaped inside enclosures: "foo,bar","baz,qux" -> [foo,bar][baz,qux]; foo\,bar -> [foo\][bar]; "foo\,bar" -> [foo\,bar]
- Double enclosures inside single enclosures are converted to single enclosures: "foobar" -> [foobar]; "foo""bar" -> [foo"bar]; """foo""" -> ["foo"]; ""foo"" -> [foo""] (empty enclosure followed by raw data)
- The $escape parameter works as expected, but unlike enclosures DOES NOT get unescaped. It is necessary to unescape the data elsewhere in the code: "\"foo\"" -> [\"foo\"]; "foo\"bar" -> [foo\"bar]
Note: the following data (which is a very common problem) is invalid: "\". Its structure is equivalent to "@ or in other words, an open enclosure, some data and no closing enclosure.
The following functions can be used to get the expected behaviour:
<?php
function
fgetcsv_unescape_enclosures_and_escapes
(
$fh
,
$length
=
0
,
$delimiter
=
','
,
$enclosure
=
'"'
,
$escape
=
'\\'
) {
$fields
=
fgetcsv
(
$fh
,
$length
,
$delimiter
,
$enclosure
,
$escape
);
if (
$fields
) {
$regex_enclosure
=
preg_quote
(
$enclosure
);
$regex_escape
=
preg_quote
(
$escape
);
$fields
=
preg_replace
(
"/
{
$regex_escape
}
(
{
$regex_enclosure
}
|
{
$regex_escape
}
)/"
,
'$1'
,
$fields
);
}
return
$fields
;
}
function
fgetcsv_unescape_all
(
$fh
,
$length
=
0
,
$delimiter
=
','
,
$enclosure
=
'"'
,
$escape
=
'\\'
) {
$fields
=
fgetcsv
(
$fh
,
$length
,
$delimiter
,
$enclosure
,
$escape
);
if (
$fields
) {
$regex_escape
=
preg_quote
(
$escape
);
$fields
=
preg_replace
(
"/
{
$regex_escape
}
(.)/s"
,
'$1'
,
$fields
);
}
return
$fields
;
}
function
fgetcsv_unescape_all_strip_last
(
$fh
,
$length
=
0
,
$delimiter
=
','
,
$enclosure
=
'"'
,
$escape
=
'\\'
) {
$fields
=
fgetcsv
(
$fh
,
$length
,
$delimiter
,
$enclosure
,
$escape
);
if (
$fields
) {
$regex_escape
=
preg_quote
(
$escape
);
$fields
=
preg_replace
(
"/
{
$regex_escape
}
(.?)/s"
,
'$1'
,
$fields
);
}
return
$fields
;
}
?>
Caution: ideally, there shouldn't be any unescaped escape characters outside enclosures; the field should be enclosed and escaped instead. If there are any, they could end up being removed as well, depending on the function used.
vladimir at luchaninov dot com
¶
8 years ago
Here is an example how to use this function with generators
https://github.com/luchaninov/csv-file-loader
(composer require "luchaninov/csv-file-loader:1.*")
$loader = new CsvFileLoader();
$loader->setFilename('/path/to/your_data.csv');
foreach ($loader->getItems() as $item) {
var_dump($item); // do something here
}
If you have CSV-file like
id,name,surname
1,Jack,Black
2,John,Doe
you'll get 2 items
['id' => '1', 'name' => 'Jack', 'surname' => 'Black']
['id' => '2', 'name' => 'John', 'surname' => 'Doe']
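The same idea can be sketched without a library using a plain generator. This is not the code of the package above; the function name csv_rows() and the choice to skip blank or ragged lines are assumptions made for illustration.

```php
<?php
// Hypothetical generator: yields each data row keyed by the header row.
function csv_rows(string $filename, string $delimiter = ','): Generator
{
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $filename");
    }
    try {
        $header = fgetcsv($handle, 0, $delimiter, '"', '\\');
        if ($header === false) {
            return; // empty file: nothing to yield
        }
        while (($row = fgetcsv($handle, 0, $delimiter, '"', '\\')) !== false) {
            // Skip blank lines and rows whose field count differs from the header.
            if ($row === [null] || count($row) !== count($header)) {
                continue;
            }
            yield array_combine($header, $row);
        }
    } finally {
        fclose($handle);
    }
}

// Usage with a temporary file holding the sample data from above.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "id,name,surname\n1,Jack,Black\n2,John,Doe\n");
$items = iterator_to_array(csv_rows($path), false);
unlink($path);
print_r($items);
```

Because rows are yielded one at a time, a foreach over csv_rows() keeps only the current row in memory; iterator_to_array() is used here only to show the result.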
Xander
¶
13 years ago
I had a problem with multibyte characters. The file was windows-1250, the script was UTF-8, and setlocale didn't work, so I made a simple and safe workaround:
<?php
$fc
=
iconv
(
'windows-1250'
,
'utf-8'
,
file_get_contents
(
$_FILES
[
'csv'
][
'tmp_name'
]));
file_put_contents
(
'tmp/import.tmp'
,
$fc
);
$handle
=
fopen
(
'tmp/import.tmp'
,
"r"
);
$rows
= array();
while ((
$data
=
fgetcsv
(
$handle
,
0
,
";"
)) !==
FALSE
) {
$rows
[] =
$data
;
}
fclose
(
$handle
);
unlink
(
'tmp/import.tmp'
);
?>
I hope you will find it useful.
Sorry for my english.
Anonymous
¶
18 years ago
beware of characters of binary value == 0, as they seem to make fgetcsv ignore the remaining part of a line where they appear.
Maybe this is normal under some convention I don't know, but a file exported from Excel had those as values for some cells *sometimes*, thus fgetcsv return variable cell counts for different lines.
i'm using php 4.3
kurtnorgaz at web dot de
¶
20 years ago
You should pay attention to the fact that "fgetcsv" does remove leading TAB-chars "chr(9)" while reading the file.
This means if you have a chr(9) as the first char in the file and you use fgetcsv, this char is automatically deleted.
Example:
file content:
chr(9)first#second#third#fourth
source:
<?php $line
=
fgetcsv
(
$handle
,
500
,
"#"
);
?>
The array $line looks like:
$line[0] = first
$line[1] = second
$line[2] = third
$line[3] = fourth
and not
$line[0] = chr(9)first
$line[1] = second
$line[2] = third
$line[3] = fourth
All chr(9) after another char is not deleted!
Example:
file content:
Achr(9)first#second#third#fourth
source:
<?php $line
=
fgetcsv
(
$handle
,
500
,
"#"
);
?>
The array $line looks like:
$line[0] = Achr(9)first
$line[1] = second
$line[2] = third
$line[3] = fourth
tokai at binaryriot dot com
¶
18 years ago
Newer PHP versions handle csv files slightly differently than older versions.
"Max Mustermann"|"Muster Road 34b"|"Berlin" |"Germany"
"Sophie Master" |"Riverstreet" |"Washington"|"USA"
The extra spaces behind a few fields in the example (which are useful, when you manually manage a small csv database to align the columns) were ignored by fgetcsv from PHP 4.3. With the new 4.4.1 release they get appended to the string, so you end up with "Riverstreet " instead the expected "Riverstreet".
Easy workaround is to just trim all fields after reading them in.
<?php
while (
$data
=
fgetcsv
(
$database
,
32768
,
"|"
) )
{
$i
=
0
;
while(isset(
$data
[
$i
]))
{
$data
[
$i
] =
rtrim
(
$data
[
$i
]);
$i
++;
}
}
?>
do not spam aleske at live dot ru
¶
13 years ago
PHP's CSV handling is non-standard and contradicts RFC 4180, thus fgetcsv() cannot properly deal with files like this example from Wikipedia:
1997,Ford,E350,"ac, abs, moon",3000.00
1999,Chevy,"Venture ""Extended Edition""","",4900.00
1999,Chevy,"Venture ""Extended Edition, Very Large""","",5000.00
1996,Jeep,Grand Cherokee,"MUST SELL!
air, moon roof, loaded",4799.00
Please note: the enclosure symbol is doubled inside fields, field data can contain linebreaks, and there is no real escape symbol. Also, fputcsv() creates non-standard CSV files.
Here is a quick and dirty RFC-compliant implementation of CSV creation and parsing:
<?php
function
array_to_csvstring
(
$items
,
$CSV_SEPARATOR
=
';'
,
$CSV_ENCLOSURE
=
'"'
,
$CSV_LINEBREAK
=
"\n"
) {
$string
=
''
;
$o
= array();
foreach (
$items
as
$item
) {
if (
stripos
(
$item
,
$CSV_ENCLOSURE
) !==
false
) {
$item
=
str_replace
(
$CSV_ENCLOSURE
,
$CSV_ENCLOSURE
.
$CSV_ENCLOSURE
,
$item
);
}
if ((
stripos
(
$item
,
$CSV_SEPARATOR
) !==
false
)
|| (
stripos
(
$item
,
$CSV_ENCLOSURE
) !==
false
)
|| (
stripos
(
$item
,
$CSV_LINEBREAK
) !==
false
)) {
$item
=
$CSV_ENCLOSURE
.
$item
.
$CSV_ENCLOSURE
;
}
$o
[] =
$item
;
}
$string
=
implode
(
$CSV_SEPARATOR
,
$o
) .
$CSV_LINEBREAK
;
return
$string
;
}
function
csvstring_to_array
(&
$string
,
$CSV_SEPARATOR
=
';'
,
$CSV_ENCLOSURE
=
'"'
,
$CSV_LINEBREAK
=
"\n"
) {
$o
= array();
$cnt
=
strlen
(
$string
);
$esc
=
false
;
$escesc
=
false
;
$num
=
0
;
$i
=
0
;
while (
$i
<
$cnt
) {
$s
=
$string
[
$i
];
if (
$s
==
$CSV_LINEBREAK
) {
if (
$esc
) {
$o
[
$num
] .=
$s
;
} else {
$i
++;
break;
}
} elseif (
$s
==
$CSV_SEPARATOR
) {
if (
$esc
) {
$o
[
$num
] .=
$s
;
} else {
$num
++;
$esc
=
false
;
$escesc
=
false
;
}
} elseif (
$s
==
$CSV_ENCLOSURE
) {
if (
$escesc
) {
$o
[
$num
] .=
$CSV_ENCLOSURE
;
$escesc
=
false
;
}
if (
$esc
) {
$esc
=
false
;
$escesc
=
true
;
} else {
$esc
=
true
;
$escesc
=
false
;
}
} else {
if (
$escesc
) {
$o
[
$num
] .=
$CSV_ENCLOSURE
;
$escesc
=
false
;
}
$o
[
$num
] .=
$s
;
}
$i
++;
}
return
$o
;
}
?>
References:
RFC4180 -
http://tools.ietf.org/html/rfc4180
Wikipedia -
http://en.wikipedia.org/wiki/Comma-separated_values#Example
Also, there is complete solution of CSV handling at
http://code.google.com/p/parsecsv-for-php/