Questions tagged [extract]

Questions related to retrieving specific information from a (typically minimally structured) data source, such as a web site, media file, source code collection or compressed archive (in which case the desired information is one or more original, uncompressed files). When using this tag, please include additional tags to clarify which specific environment/language/scenario your question refers to.

1143 votes
13 answers
1.3m views

How to get first N number of elements from an array

I am working with JavaScript (ES6) / Facebook React and trying to get the first 3 elements of an array that varies in size. I would like to do the equivalent of Linq take(n). In my JSX file I have the ...
user1526912's user avatar
  • 16.5k
676 votes
11 answers
334k views

The difference between bracket [ ] and double bracket [[ ]] for accessing the elements of a list or dataframe

R provides two different methods for accessing the elements of a list or data.frame: [] and [[]]. What is the difference between the two, and when should I use one over the other?
Sharpie's user avatar
  • 17.5k
325 votes
20 answers
1.1m views

How do you extract a column from a multi-dimensional array?

Does anybody know how to extract a column from a multi-dimensional array in Python?
267 votes
15 answers
683k views

How to extract all values from a dictionary in Python?

I have a dictionary d = {1:-0.3246, 2:-0.9185, 3:-3985, ...}. How do I extract all of the values of d into a list l?
Naveen C.'s user avatar
  • 3,295
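As an illustration of what the dictionary question above is asking for, here is a minimal sketch. It assumes only the three sample entries shown in the excerpt (the elided remainder is left out), and the name `values` is ours rather than the asker's. In Python 3.7+ the resulting list follows the dictionary's insertion order.

```python
# Sample dictionary, limited to the entries visible in the question excerpt.
d = {1: -0.3246, 2: -0.9185, 3: -3985}

# dict.values() returns a view; list() copies it into a plain list.
values = list(d.values())

print(values)
```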
232 votes
2 answers
89k views

How do I move a Git branch out into its own repository?

I have a branch that I'd like to move into a separate Git repository, and ideally keep that branch's history in the process. So far I've been looking at git filter-branch, but I can't make out ...
Aupajo's user avatar
  • 5,965
209 votes
7 answers
327k views

How can I extract the folder path from file path in Python?

I would like to get just the folder path from the full path to a file. For example T:\Data\DBDesign\DBDesign_93_v141b.mdb and I would like to get just T:\Data\DBDesign (excluding the \...
Genspec's user avatar
  • 2,339
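One possible way to do what the folder-path question above describes is sketched below. It uses `pathlib.PureWindowsPath` so that the backslash-separated example path is parsed the same way on any operating system; that choice and the variable names are assumptions for illustration, not something taken from the question.

```python
from pathlib import PureWindowsPath

# Example path from the question; a raw string keeps the backslashes literal.
full_path = PureWindowsPath(r"T:\Data\DBDesign\DBDesign_93_v141b.mdb")

# .parent drops the final component (the file name) and keeps the folder.
folder = str(full_path.parent)

print(folder)
```

On a Windows machine, `os.path.dirname` applied to the same string is a common alternative.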
207 votes
7 answers
234k views

Accessing last x characters of a string in Bash

I found out that with ${string:0:3} one can access the first 3 characters of a string. Is there an equally easy method to access the last three characters?
aldorado's user avatar
  • 4,624
188 votes
15 answers
307k views

How to extract text from a PDF? [closed]

Can anyone recommend a library/API for extracting the text and images from a PDF? We need to be able to get at text that is contained in pre-known regions of the document, so the API will need to give ...
Budda007's user avatar
  • 1,913
187 votes
9 answers
388k views

Extract first item of each sublist in Python

I'm wondering what is the best way to extract the first item of each sublist in a list of lists and append it to a new list. So if I have: lst = [[a,b,c], [1,2,3], [x,y,z]] And, I want to pull out a, ...
konrad's user avatar
  • 3,616
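The sublist question above can be illustrated with a list comprehension. This is only a sketch: the letters in the excerpt are written here as string literals (an assumption, since the excerpt shows bare names), and `firsts` is a hypothetical name for the new list.

```python
# Sample list of lists from the question, with the letters as strings.
lst = [["a", "b", "c"], [1, 2, 3], ["x", "y", "z"]]

# Take element 0 of every sublist; this assumes no sublist is empty.
firsts = [sub[0] for sub in lst]

print(firsts)
```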
183 votes
17 answers
257k views

How to get the first word of a sentence in PHP?

I want to extract the first word of a variable from a string. For example, take this input: <?php $myvalue = 'Test me more'; ?> The resultant output should be Test, which is the first word of ...
ali's user avatar
  • 1,897
173 votes
17 answers
260k views

How to extract one column of a csv file

If I have a csv file, is there a quick bash way to print out the contents of only any single column? It is safe to assume that each row has the same number of columns, but each column's content would ...
user788171's user avatar
  • 17.2k
166 votes
15 answers
413k views

Javascript - How to extract filename from a file input control

When a user selects a file in a web page I want to be able to extract just the filename. I did try str.search function but it seems to fail when the file name is something like this: c:\uploads\ilike....
Yogi Yang 007's user avatar
161 votes
7 answers
363k views

How to extract a floating number from a string [duplicate]

I have a number of strings similar to Current Level: 13.4 db. and I would like to extract just the floating point number. I say floating and not decimal as it's sometimes whole. Can RegEx do this or ...
Ben Keating's user avatar
  • 8,266
141 votes
5 answers
295k views

Get string after character [duplicate]

I have a string that looks like this: GenFiltEff=7.092200e-01 Using bash, I would like to just get the number after the = character. Is there a way to do this?
user788171's user avatar
  • 17.2k
118 votes
23 answers
198k views

Extract images from PDF without resampling, in python?

How might one extract all images from a pdf document, at native resolution and format? (Meaning extract tiff as tiff, jpeg as jpeg, etc. and without resampling). Layout is unimportant, I don't care ...
matt wilkie's user avatar
  • 17.8k
106 votes
4 answers
29k views

What algorithm does Readability use for extracting text from URLs?

For a while, I've been trying to find a way of intelligently extracting the "relevant" text from a URL by eliminating the text related to ads and all the other clutter. After several months of ...
user300981's user avatar
  • 1,423
94 votes
6 answers
232k views

Read Content from Files which are inside Zip file

I am trying to create a simple Java program which reads and extracts the content from the file(s) inside a zip file. The zip file contains 3 files (txt, pdf, docx). I need to read the contents of all these ...
S Jagdeesh's user avatar
  • 1,533
92 votes
8 answers
159k views

Extract and delete all .gz in a directory- Linux [closed]

I have a directory. It has about 500K .gz files. How can I extract all .gz in that directory and delete the .gz files?
user2247643's user avatar
88 votes
9 answers
527k views

How to extract only the year from the date in SQL Server 2008?

In SQL Server 2008, how can I extract only the year from a date? In the DB I have a date column, and I need to extract the year from it. Is there any function for that?
Praveen's user avatar
  • 56k
87 votes
5 answers
220k views

pandas extract year from datetime: df['year'] = df['date'].year is not working

I import a dataframe via read_csv, but for some reason can't extract the year or month from the series df['date'], trying that gives AttributeError: 'Series' object has no attribute 'year': date ...
MJS's user avatar
  • 1,603
84 votes
8 answers
125k views

How to parse the Manifest.mbdb file in an iOS 4.0 iTunes Backup

In iOS 4.0 Apple has redesigned the backup process. iTunes used to store a list of filenames associated with backup files in the Manifest.plist file, but in iOS 4.0 it has moved this information to ...
Padraig's user avatar
  • 1,569
83 votes
7 answers
49k views

Extracting an information from web page by machine learning

I would like to extract a specific type of information from web pages in Python. Let's say postal address. It has thousands of forms, but still, it is somehow recognizable. As there is a large number ...
Honza Javorek's user avatar
75 votes
5 answers
77k views

Extract files from zip without keeping the structure using python ZipFile?

I am trying to extract all files from a .zip containing subfolders into one folder. I want all the files from the subfolders extracted into a single folder without keeping the original structure. At the moment, I ...
Thammas's user avatar
  • 1,003
71 votes
8 answers
94k views

How to extract top-level domain name (TLD) from URL

How would you extract the domain name from a URL, excluding any subdomains? My initial simplistic attempt was: '.'.join(urlparse.urlparse(url).netloc.split('.')[-2:]) This works for http://www.foo....
hoju's user avatar
  • 29.1k
70 votes
11 answers
180k views

Extract the text out of HTML string using JavaScript

I am trying to get the inner text of an HTML string, using a JS function (the string is passed as an argument). Here is the code: function extractContent(value) { var content_holder = ""; for ...
Toshkuuu's user avatar
  • 825
67 votes
8 answers
293k views

Extract MSI from EXE

I want to extract the MSI of an EXE setup to publish over a network. For example, using Universal Extractor, but it doesn't work for Java Runtime Environment.
emdadgar2's user avatar
  • 820
61 votes
6 answers
89k views

How to extract just plain text from .doc & .docx files? [closed]

Anyone know of anything they can recommend in order to extract just the plain text from a .doc or .docx? I've found this - wondered if there were any other suggestions?
docextract's user avatar
56 votes
4 answers
214k views

Java: export to an .jar file in eclipse

I'm trying to export a program in Eclipse to a jar file. In my project I have added some pictures and PDF:s. When I'm exporting to jar file, it seems that only the main has been compiled and ...
Adis's user avatar
  • 804
55 votes
6 answers
15k views

How to extract one file with commit history from a Git repo with index-filter & co?

I have a Git repo converted from SVN to Mercurial to Git, and I wanted to extract just one source file. I also had weird characters like aÌ (an encoding mismatch corrupted Unicode ä) and spaces in the ...
peterhil's user avatar
  • 1,556
51 votes
9 answers
151k views

Get min and max value in PHP Array

I have an array like this: array (0 => array ( 'id' => '20110209172713', 'Date' => '2011-02-09', 'Weight' => '200', ), 1 => array ( 'id' => '20110209172747', 'Date' => '...
Peter's user avatar
  • 1,294
47 votes
17 answers
31k views

What is so wrong with extract()?

I was recently reading this thread, on some of the worst PHP practices. In the second answer there is a mini discussion on the use of extract(), and I'm just wondering what all the huff is about. I ...
barfoon's user avatar
  • 27.8k
45 votes
3 answers
128k views

Extract string before "|" [duplicate]

I have a data set wherein a column looks like this: ABC|DEF|GHI, ABCD|EFG|HIJK, ABCDE|FGHI|JKL, DEF|GHIJ|KLM, GHI|JKLM|NO|PQRS, BCDE|FGHI|JKL .... and so on I need to extract the ...
Shounak Chakraborty's user avatar
42 votes
3 answers
105k views

Extract string from between quotations

I want to extract information from user-inputted text. Imagine I input the following: SetVariables "a" "b" "c" How would I extract information between the first set of quotations? Then the second? ...
Reznor's user avatar
  • 1,255
42 votes
6 answers
44k views

Extract .xip file into a folder from command line?

Apple occasionally uses a proprietary XIP file format, particularly when distributing versions of Xcode. It is an analog to zip, but is signed, allowing it to be verified on the receiving system. When a ...
Antony Raphel's user avatar
41 votes
12 answers
29k views

Extracting information from PDFs of research papers [closed]

I need a mechanism for extracting bibliographic metadata from PDF documents, to save people entering it by hand or cut-and-pasting it. At the very least, the title and abstract. The list of authors ...
Christopher Gutteridge's user avatar
41 votes
7 answers
110k views

How to extract data from a PDF file while keeping track of its structure?

My objective is to extract the text and images from a PDF file while parsing its structure. The scope for parsing the structure is not exhaustive; I only need to be able to identify headings and ...
Marcel's user avatar
  • 6,244
40 votes
3 answers
53k views

C# regex pattern to extract urls from given string - not full html urls but bare links as well

I need a regex which will do the following: extract all strings which start with http://, and extract all strings which start with www. So I need to extract these 2. For example there is this given ...
Furkan Gözükara's user avatar
39 votes
3 answers
71k views

JAR - extracting specific files

I have .class and .java files in JAR archive. Is there any way to extract only .java files from it? I've tried this command but it doesn't work: jar xf jar-file.jar *.java
user3521479's user avatar
38 votes
7 answers
83k views

Extract digits from string - Google spreadsheet

In Google spreadsheets, I need a formula to extract all digits (0 to 9) contained into an arbitrary string, that might contain any possible character and put them into a single cell. Examples (Input -...
thanos.a's user avatar
  • 2,416
38 votes
3 answers
180k views

Use binwalk to extract all files

I have a file music.mp3. After using binwalk, I get the result: pexea12@DESMICE:~/Downloads$ binwalk music.mp3 DECIMAL HEXADECIMAL DESCRIPTION ----------------------------------------------...
pexea12's user avatar
  • 1,159
37 votes
6 answers
75k views

How do you extract a url from a string using python?

For example: string = "This is a link http://www.google.com" How could I extract 'http://www.google.com' ? (Each link will be of the same format i.e 'http://')
Sheldon's user avatar
  • 9,949
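For the URL question above, a minimal sketch with the standard `re` module follows. It relies on the question's stated assumption that every link starts with 'http://', and it treats the link as running up to the next whitespace character; that second rule is our simplification and would also capture trailing punctuation in other inputs.

```python
import re

# Sample string from the question.
text = "This is a link http://www.google.com"

# Match 'http://' followed by one or more non-whitespace characters.
match = re.search(r"http://\S+", text)

# re.search returns None when nothing matches, so guard before .group().
url = match.group(0) if match else None

print(url)
```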
37 votes
5 answers
96k views

How to extract metadata from an image using python?

How can I extract metadata from an image using Python?
MrDanger's user avatar
  • 391
36 votes
3 answers
53k views

Extract a ZIP file programmatically by DotNetZip library?

I have a function that gets a ZIP file and extracts it to a directory (I use the DotNetZip library). public void ExtractFileToDirectory(string zipFileName, string outputDirectory) { ZipFile zip = ...
Ehsan's user avatar
  • 3,461
36 votes
2 answers
12k views

Extract part of a git repository?

Assume my git repository has the following structure: /.git /Project /Project/SubProject-0 /Project/SubProject-1 /Project/SubProject-2 and the repository has quite some commits. Now one of the ...
Rio's user avatar
  • 1,897
35 votes
3 answers
70k views

How can I untar a tar.bz file in unix?

I've found tons of pages saying how to untar tar.bz2 files, but how would one untar a tar.bz file?
Tim's user avatar
  • 4,335
34 votes
3 answers
25k views

Java library for keywords extraction from input text [closed]

I'm looking for a Java library to extract keywords from a block of text. The process should be as follows: stop word cleaning -> stemming -> searching for keywords based on English linguistics ...
Shay's user avatar
  • 507
33 votes
6 answers
88k views

Extracting text from PDFs in C# [closed]

Pretty simply, I need to rip text out of multiple PDFs (quite a lot actually) in order to analyse the contents before sticking it in an SQL database. I've found some pretty sketchy free C# libraries ...
Duncan Tait's user avatar
  • 2,027
32 votes
11 answers
61k views

How can I extract multiple 7z files in folder at once in Ubuntu?

How can I extract about 900 7z files which are all located in the same folder (all have only one file inside) without doing it one by one? I am using Ubuntu 10.10. All files are located in /home/...
Robert Cardona's user avatar
31 votes
3 answers
54k views

Extract first word from a column and insert into new column [duplicate]

I have a dataframe below and want to extract the first word and insert it into a new column Dataframe1: COL1 Nick K Jones Dave G Barros Matt H Smith Convert it to this: Dataframe2: COL1 ...
Nick's user avatar
  • 853
30 votes
4 answers
60k views

Regular expressions C# - is it possible to extract matches while matching?

Say, I have a string that I need to verify the correct format of; e.g. RR1234566-001 (2 letters, 7 digits, dash, 1 or more digits). I use something like: Regex regex = new Regex(patternString)...
sarsnake's user avatar
  • 27.2k
