Text extraction from Arabic PDF General Discussions Forum

 

Text extraction from Arabic PDF


6:38 am
11/09/2011


Mochum

New Member

posts 0

Post edited 9:20 am – 11/09/2011 by Mochum


Hi, I am a student of Middle Eastern languages. My task is this: I have a few PDF files with Arabic text and want to rearrange the layout in InDesign (I only have the PDFs, no other format). Copying and pasting text from the PDF switches the order of certain character combinations and displaces diacritics, and exporting text from Acrobat to Word, RTF, etc. does not yield any usable results.

 

Would it help me if I had CS 5.5 ME? I have seen on other forums that others trying to do the same (pasting Arabic text from PDF to InDesign) have similar problems, but I haven't found a solution so far.

 

Or is there maybe a plugin or tool that could help me with this? I already have the IndicPlus plugin from Word-Tools, which works fine to arrange the text flow of Arabic text in InDesign, but I still have this encoding problem.

 

Fixing those errors manually seems like a nightmare!

 

Regards,

Mochum

3:37 pm
11/10/2011


Harbs

Admin

posts 80

How well copying Arabic text works depends, among other things, on how the PDF was created.

Very often the text encoding in the PDF is not correct.
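
One form an incorrect encoding can take is text stored as Arabic presentation-form code points (the positional glyph variants in the Unicode blocks Arabic Presentation Forms-A and -B) instead of the base letters. Below is a minimal Python sketch, assuming that is what a given PDF produces, which may not be the case here. The function names are hypothetical and only the standard library is used. It detects such code points and folds them back to base letters with NFKC normalization. It does not repair reversed character order or misplaced diacritics.

```python
import unicodedata

# Code point ranges of the Arabic Presentation Forms-A and -B blocks.
# The second range stops at U+FEFC so the byte-order mark (U+FEFF) is not flagged.
PRESENTATION_RANGES = ((0xFB50, 0xFDFF), (0xFE70, 0xFEFC))


def has_presentation_forms(text: str) -> bool:
    """Return True if any character lies in a presentation-form range."""
    return any(
        low <= ord(ch) <= high
        for ch in text
        for (low, high) in PRESENTATION_RANGES
    )


def normalize_arabic(text: str) -> str:
    """Fold presentation forms back to base letters via NFKC.

    NFKC applies compatibility decomposition, so a positional form such as
    MEEM INITIAL FORM (U+FEE3) becomes the base letter MEEM (U+0645), and a
    lam-alef ligature (U+FEFB) becomes LAM + ALEF.
    """
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    pasted = "\uFEFB"  # lam-alef ligature, isolated form
    if has_presentation_forms(pasted):
        cleaned = normalize_arabic(pasted)
        print([hex(ord(ch)) for ch in cleaned])
```

If `has_presentation_forms` returns False for the pasted text, the problem lies elsewhere (for example in character order or the font's internal mapping), and this step will not help.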

