Questions tagged [large-files]

Large files, whether binary or text, can sometimes be problematic even for an experienced programmer. This tag should be used if issues arise relating to opening and/or writing large files in a text editor, managing resources that run to gigabytes, or strategic decisions for large amounts of data.

1022 votes
2 answers
2.1m views

Text editor to open big (giant, huge, large) text files [closed]

I mean 100+ MB big; such text files can push the envelope of editors. I need to look through a large XML file, but cannot if the editor is buggy. Any suggestions?
553 votes
13 answers
209k views

Managing large binary files with Git

I am looking for opinions on how to handle large binary files on which my source code (web application) is dependent. We are currently discussing several alternatives: Copy the binary files by hand. ...
pi.'s user avatar
  • 21.4k
395 votes
13 answers
1.2m views

Read and parse a Json File in C#

How does one read a very large JSON file into an array in C# to be split up for later processing? I have managed to get something working that will: Read the file Miss out headers and only read ...
Chris Devine's user avatar
  • 3,969
229 votes
20 answers
458k views

Number of lines in a file in Java

I use huge data files. Sometimes I only need to know the number of lines in these files. Usually I open them up and read them line by line until I reach the end of the file. I was wondering if there ...
Mark's user avatar
  • 10.9k
145 votes
9 answers
136k views

What is the fastest way to create a checksum for large files in C#

I have to sync large files across some machines. The files can be up to 6GB in size. The sync will be done manually every few weeks. I can't take the filenames into consideration because they can change ...
crono's user avatar
  • 3,633
126 votes
17 answers
175k views

How to find the largest file in a directory and its subdirectories?

We're just starting a UNIX class and are learning a variety of Bash commands. Our assignment involves performing various commands on a directory that has a number of folders under it as well. I know ...
Rekson's user avatar
  • 1,363
116 votes
10 answers
136k views

Working with huge files in VIM

I tried opening a huge (~2GB) file in VIM but it choked. I don't actually need to edit the file, just jump around efficiently. How can I go about working with very large files in VIM?
hoju's user avatar
  • 29.1k
114 votes
13 answers
252k views

Reading large text files with streams in C#

I've got the lovely task of working out how to handle large files being loaded into our application's script editor (it's like VBA for our internal product for quick macros). Most files are about 300-...
Nicole Lee's user avatar
  • 1,163
113 votes
13 answers
163k views

Git lfs - "this exceeds GitHub's file size limit of 100.00 MB"

I have some CSV files that are larger than GitHub's file size limit of 100.00 MB. I have been trying to use the Git Large File Storage extension. https://git-lfs.github.com/ From LFS - "Large file ...
LearningSlowly's user avatar
106 votes
4 answers
118k views

Read lines from compressed text files

Is it possible to read a line from a gzip-compressed text file using Python without extracting the file completely? I have a text.gz file which is around 200 MB. When I extract it, it becomes 7.4 GB. ...
delete_this_account's user avatar
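
For the gzip question above, a minimal sketch using only Python's standard `gzip` module may help illustrate the idea. It assumes the archive holds UTF-8 text; the function name `iter_gzip_lines` is made up for this illustration and does not come from the original question.

```python
import gzip
from typing import Iterator


def iter_gzip_lines(path: str, encoding: str = "utf-8") -> Iterator[str]:
    """Yield text lines from a gzip file without extracting it to disk.

    Assumption: the compressed content is text in the given encoding.
    """
    # Mode "rt" decompresses incrementally and decodes to text, so only
    # a small buffer is held in memory rather than the whole file.
    with gzip.open(path, mode="rt", encoding=encoding) as handle:
        for line in handle:
            yield line.rstrip("\n")
```

Because the function is a generator, a caller can stop after the first line (for example with `next(iter_gzip_lines(path))`) and the rest of the archive is never decompressed.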
98 votes
2 answers
112k views

HTML5 - How to stream large .mp4 files?

I'm trying to set up a very basic HTML5 page that loads a .mp4 video that is 20MB. It appears that the browser needs to download the entire thing rather than just playing the first part of the video ...
longda's user avatar
  • 10.3k
95 votes
11 answers
84k views

Is there a memory efficient and fast way to load big JSON files?

I have some 500MB JSON files. If I use the "trivial" json.load() to load the content all at once, it will consume a lot of memory. Is there a way to partially read the file? If it was a ...
duduklein's user avatar
  • 10.3k
93 votes
12 answers
185k views

How can I import a large (14 GB) MySQL dump file into a new MySQL database?

How can I import a large (14 GB) MySQL dump file into a new MySQL database?
TRN 7's user avatar
  • 989
88 votes
8 answers
35k views

gitignore by file size?

I'm trying to implement Git to manage creative assets (Photoshop, Illustrator, Maya, etc.), and I'd like to exclude files from Git based on file size rather than extension, location, etc. For example,...
Warren Benedetto's user avatar
82 votes
24 answers
277k views

Best Free Text Editor Supporting *More Than* 4GB Files? [closed]

I am looking for a text editor that will be able to load a 4+ Gigabyte file into it. Textpad doesn't work. I own a copy of it and have been to its support site, it just doesn't do it. Maybe I need ...
Taptronic's user avatar
  • 5,139
81 votes
13 answers
245k views

How to read large text file on windows? [closed]

I have a large server log file (~750 MB) which I can't open with either Notepad or Notepad++ (they both say the file is too large). Can anyone suggest a program (for Windows) that will only read a ...
nedlud's user avatar
  • 1,832
73 votes
21 answers
58k views

Get last 10 lines of very large text file > 10GB

What is the most efficient way to display the last 10 lines of a very large text file (this particular file is over 10GB). I was thinking of just writing a simple C# app but I'm not sure how to do ...
Chris Conway's user avatar
  • 16.4k
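
One common approach to the "last lines" question above is to seek to the end of the file and read fixed-size blocks backwards until enough newlines have been seen. The sketch below shows that idea in Python rather than the C# the asker mentions; the function name `tail_lines` and the default block size are assumptions chosen for illustration.

```python
import os
from typing import List


def tail_lines(path: str, count: int = 10, block_size: int = 64 * 1024) -> List[bytes]:
    """Return the last `count` lines of a file as raw bytes.

    Only the final blocks of the file are read, never the whole file.
    """
    if count <= 0:
        return []
    data = b""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)
        position = handle.tell()
        # Walk backwards one block at a time. Stop once the buffer holds
        # more than `count` newlines, which guarantees that the last
        # `count` lines are complete, or once the file start is reached.
        while position > 0 and data.count(b"\n") <= count:
            step = min(block_size, position)
            position -= step
            handle.seek(position)
            data = handle.read(step) + data
    return data.splitlines()[-count:]
```

The lines are returned as bytes so that a multi-byte character split across a block boundary cannot cause a decoding error mid-read; the caller can decode each line afterwards.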
60 votes
3 answers
94k views

Large file upload through HTML form (more than 2 GB)

Is there any way to upload a file of more than 2 GB using a simple HTML form upload? Previously I have been uploading large files through Silverlight using chunking (dividing a large file into segments and ...
Nadeem Ullah's user avatar
58 votes
7 answers
80k views

Git with large files

Situation I have two servers, Production and Development. On Production server, there are two applications and multiple (6) databases (MySQL) which I need to distribute to developers for testing. All ...
Jakub Riedl's user avatar
  • 1,076
47 votes
3 answers
31k views

How do I read a large CSV file with Scala Stream class?

How do I read a large CSV file (> 1 GB) with a Scala Stream? Do you have a code example? Or would you use a different way to read a large CSV file without loading it into memory first?
Jan Willem Tulp's user avatar
45 votes
3 answers
24k views

How to efficiently write large files to disk on background thread (Swift)

Update I have resolved and removed the distracting error. Please read the entire post and feel free to leave comments if any questions remain. Background I am attempting to write relatively large ...
Tommie C.'s user avatar
  • 13.1k
45 votes
8 answers
56k views

Upload 1GB files using chunking in PHP

I have a web application that accepts file uploads of up to 4 MB. The server-side script is PHP and the web server is NGINX. Many users have requested to increase this limit drastically to allow upload of ...
rjha94's user avatar
  • 4,298
44 votes
6 answers
40k views

Using Python Iterparse For Large XML Files

I need to write a parser in Python that can process some extremely large files ( > 2 GB ) on a computer without much memory (only 2 GB). I wanted to use iterparse in lxml to do it. My file is of the ...
Dave Johnshon's user avatar
43 votes
6 answers
30k views

Searching for a string in a large text file - profiling various methods in python

This question has been asked many times. After spending some time reading the answers, I did some quick profiling to try out the various methods mentioned previously... I have a 600 MB file with ...
user's user avatar
  • 18.2k
42 votes
15 answers
64k views

Java : Read last n lines of a HUGE file

I want to read the last n lines of a very big file without reading the whole file into any buffer/memory area using Java. I looked around the JDK APIs and Apache Commons I/O and am not able to locate ...
Gaurav Verma's user avatar
40 votes
8 answers
64k views

Writing large files with Node.js

I'm writing a large file with node.js using a writable stream: var fs = require('fs'); var stream = fs.createWriteStream('someFile.txt', { flags : 'w' }); var lines; while (lines = getLines()) { ...
nab's user avatar
  • 4,791
40 votes
6 answers
100k views

Large file upload with WebSocket

I'm trying to upload large files (at least 500MB, preferably up to a few GB) using the WebSocket API. The problem is that I can't figure out how to write "send this slice of the file, release the ...
Vlad Ciobanu's user avatar
  • 1,473
39 votes
9 answers
85k views

How to read line-delimited JSON from large file (line by line)

I'm trying to load a large file (2GB in size) filled with JSON strings, delimited by newlines. Ex: { "key11": value11, "key12": value12, } { "key21": value21, "key22": value22, } … ...
Cat's user avatar
  • 7,172
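
For the line-delimited JSON question above, a hedged sketch of the usual streaming pattern is shown below. It assumes each record sits on exactly one line and is valid JSON (the trailing commas in the excerpt's example would not parse as written); the helper name `iter_json_lines` is hypothetical.

```python
import json
from typing import Any, Iterator


def iter_json_lines(path: str) -> Iterator[Any]:
    """Yield one parsed JSON value per non-empty line of the file.

    Assumption: every record occupies a single line (newline-delimited JSON).
    """
    with open(path, "r", encoding="utf-8") as handle:
        # Iterating over the file object reads one line at a time,
        # so memory use depends on the longest line, not the file size.
        for line in handle:
            line = line.strip()
            if not line:
                continue  # tolerate blank separator lines
            yield json.loads(line)
```

Each record is parsed and handed to the caller before the next line is read, so a 2GB file can be processed with a small, roughly constant memory footprint.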
37 votes
13 answers
29k views

Very large uploads with PHP

I want to allow uploads of very large files into our PHP application (hundreds of megs to 8 gigs). There are a couple of problems with this however. Browser: HTML uploads have crappy feedback, we need ...
Evert's user avatar
  • 96.7k
33 votes
8 answers
6k views

large amount of data in many text files - how to process?

I have large amounts of data (a few terabytes) and accumulating... They are contained in many tab-delimited flat text files (each about 30MB). Most of the task involves reading the data and ...
hatmatrix's user avatar
  • 43.9k
33 votes
8 answers
18k views

Binary search in a sorted (memory-mapped ?) file in Java

I am struggling to port a Perl program to Java, and learning Java as I go. A central component of the original program is a Perl module that does string prefix lookups in a +500 GB sorted text file ...
sds's user avatar
  • 383
33 votes
3 answers
47k views

Are there any good workarounds to the GitHub 100MB file size limit for text files?

I have a 190 MB plain text file that I want to track on GitHub. The text file is a pronunciation lexicon file for our text-to-speech engine. We regularly add and modify lines in the text files, and ...
josteinaj's user avatar
  • 477
32 votes
8 answers
101k views

Reading very large files in PHP

fopen is failing when I try to read in a very moderately sized file in PHP. A 6 meg file makes it choke, though smaller files around 100k are just fine. I've read that it is sometimes necessary to ...
user avatar
31 votes
6 answers
66k views

Python: How to read huge text file into memory

I'm using Python 2.6 on a Mac Mini with 1GB RAM. I want to read in a huge text file $ ls -l links.csv; file links.csv; tail links.csv -rw-r--r-- 1 user user 469904280 30 Nov 22:42 links.csv links....
asmaier's user avatar
  • 11.4k
28 votes
7 answers
54k views

Best way to process large XML in PHP [duplicate]

I have to parse large XML files in PHP; one of them is 6.5 MB and they could be even bigger. The SimpleXML extension, as I've read, loads the entire file into an object, which may not be very efficient....
Petruza's user avatar
  • 12k
28 votes
7 answers
39k views

Processing large JSON files in PHP

I am trying to process somewhat large (possibly up to 200M) JSON files. The structure of the file is basically an array of objects. So something along the lines of: [ {"property":"value", "...
The Mighty Rubber Duck's user avatar
27 votes
5 answers
30k views

Read lines by number from a large file

I have a file with 15 million lines (will not fit in memory). I also have a small vector of line numbers - the lines that I want to extract. How can I read-out the lines in one pass? I was hoping ...
Aleksandr Levchuk's user avatar
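
The one-pass extraction asked about above can be sketched as follows. This is an illustration in Python, not necessarily the asker's language; line numbers are assumed to be 1-based, and the name `read_lines_by_number` is invented for this example.

```python
from typing import Dict, Iterable


def read_lines_by_number(path: str, wanted: Iterable[int]) -> Dict[int, str]:
    """Collect the requested 1-based line numbers in a single pass."""
    remaining = set(wanted)
    if not remaining:
        return {}
    last_needed = max(remaining)
    found: Dict[int, str] = {}
    with open(path, "r", encoding="utf-8") as handle:
        # Stream the file line by line; only the wanted lines are kept.
        for number, line in enumerate(handle, start=1):
            if number in remaining:
                found[number] = line.rstrip("\n")
            if number >= last_needed:
                break  # nothing further down the file is needed
    return found
```

Stopping at the largest requested line number avoids scanning the rest of the file, which matters when the wanted lines are near the top of a 15-million-line file.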
26 votes
8 answers
17k views

ERROR: could not stat file "XX.csv": Unknown error

I run this command: COPY XXX FROM 'D:/XXX.csv' WITH (FORMAT CSV, HEADER TRUE, NULL 'NULL') In Windows 7, it successfully imports CSV files of less than 1GB. If the file is larger than 1GB, I get ...
亚军吴's user avatar
  • 391
26 votes
13 answers
20k views

PHP x86 How to get filesize of > 2 GB file without external program?

I need to get the file size of a file over 2 GB in size (testing on a 4.6 GB file). Is there any way to do this without an external program? Current status: filesize(), stat() and fseek() fail ...
Honza Kuchař's user avatar
25 votes
4 answers
4k views

Is it possible to slim a .git repository without rewriting history?

We have a number of git repositories which have grown to an unmanageable size due to the historical inclusion of binary test files and java .jar files. We are just about to go through the exercise of ...
Mark Booth's user avatar
  • 7,794
25 votes
1 answer
28k views

How does HTTP file upload work for large files?

I just want to elaborate on this question: How does HTTP file upload work?. This is the form from the question: <form enctype="multipart/form-data" action="http://localhost:3000/upload?...
nxh's user avatar
  • 1,049
24 votes
6 answers
49k views

How do I download a large file (via HTTP) in .NET?

I need to download a large file (2 GB) over HTTP in a C# console application. Problem is, after about 1.2 GB, the application runs out of memory. Here's the code I'm using: WebClient ...
Nick Cartwright's user avatar
24 votes
3 answers
22k views

How to fix: "The file is too large: __ , showing a read-only preview of the first: __" in Intellij IDEA?

I am trying to view a large file in IntelliJ IDEA, but I am coming across the error: "The file is too large: 30.1 MB, showing a read-only preview of the first 2.56 MB". I have seen some previous ...
CuriousProgrammer70184's user avatar
21 votes
6 answers
48k views

How to avoid OutOfMemoryError when uploading a large file using Jersey client

I am using Jersey client for HTTP-based requests. It works well if the file is small, but I run into an error when I post a 700M file: Exception in thread "main" java.lang.OutOfMemoryError: ...
Mr rain's user avatar
  • 993
21 votes
7 answers
26k views

Python Random Access File

Is there a Python file type for accessing random lines without traversing the whole file? I need to search within a large file; reading the whole thing into memory wouldn't be possible. Any types or ...
Mantas Vidutis's user avatar
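
One possible way to get random line access, sketched below, is to record the byte offset of every line once and then `seek()` directly to any line afterwards. Note the assumption: building the index still takes one full pass over the file, so this only pays off when many lookups follow. The names `build_line_index` and `read_line` are hypothetical, and line numbers are 0-based.

```python
from typing import List


def build_line_index(path: str) -> List[int]:
    """Return the byte offset at which each line of the file starts."""
    offsets: List[int] = []
    position = 0
    # Binary mode keeps offsets exact regardless of encoding or newlines.
    with open(path, "rb") as handle:
        for line in handle:
            offsets.append(position)
            position += len(line)
    return offsets


def read_line(path: str, offsets: List[int], line_number: int) -> str:
    """Fetch a single 0-based line by seeking straight to its offset."""
    with open(path, "rb") as handle:
        handle.seek(offsets[line_number])
        return handle.readline().decode("utf-8").rstrip("\r\n")
```

The index costs one integer per line, which is far smaller than the file itself, and it could be saved to disk so later runs skip the initial pass.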
21 votes
2 answers
24k views

FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - JavaScript heap out of memory when processing large files with fs

I have a Node.js script that processes a bunch of large .csv files (1.3GB in total). It runs for a moment and then throws this error: FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed - ...
TOPKAT's user avatar
  • 7,711
21 votes
3 answers
38k views

Downloading a Large File - iPhone SDK

I am using Erica Sadun's method of Asynchronous Downloads (link here for the project file: download), however her method does not work with large files (50 MB or above). If I try to ...
lab12's user avatar
  • 6,440
20 votes
8 answers
14k views

How do I join pairs of consecutive lines in a large file (1 million lines) using vim, sed, or another similar tool?

I need to move the contents of every second line up to the line above such that line2's data is alongside line1's, either comma or space separated works. Input: line1 line2 line3 line4 Output: ...
janeruthh's user avatar
  • 203
19 votes
4 answers
27k views

Edit very large sql dump/text file (on linux)

I have to import a large MySQL dump (up to 10G). However, the SQL dump already comes with a predefined database structure, including index definitions. I want to speed up the db insert by removing the index and ...
geo's user avatar
  • 1,001
19 votes
4 answers
18k views

How may I scroll with vim into a big file?

I have a big file with thousands of lines of thousands of characters. I move the cursor to the 3000th character. If I use PageDown or Ctrl+D, the file will scroll but the cursor will come back to the ...
Luc M's user avatar
  • 17k
