A Little Closer to Center...
Musings about Life, Linux, and Latter-day Saints.
Wed - Jul 06, 2011 : 05:59 pm
PDFTK - Generate Issues
So, I've been working on this PDF project for a while, and I think it's worth writing down a few notes that were somewhat elusive on the net.

The FDF file produced by PDFTK's generate_fdf operation is encoded in UTF-16, which makes it very hard to work with. I found this script, which solves the problem and still leaves the FDF file functional.



I doubt it will work if the field names contain anything other than ASCII.

$ cat Project2.fdf | sed -e's/\x00//g' | sed -e's/\xFE\xFF//g' | less

(Change "| less" to "> filename.txt" and it'll write out the new FDF file nicely.)
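
For anyone who would rather not depend on sed's handling of binary data, here is a rough Python 3 sketch of the same idea: strip the NUL bytes and the FE FF byte-order marks, working on raw bytes. The function names and the output filename are made up for illustration, and it has the same limitation as the pipeline above (it only makes sense for ASCII field names).

```python
from pathlib import Path

BOM_UTF16_BE = b"\xfe\xff"  # the byte-order mark the second sed expression removes


def strip_utf16_debris(data: bytes) -> bytes:
    """Remove NUL bytes, then FE FF marks, in the same order as the sed pipeline."""
    return data.replace(b"\x00", b"").replace(BOM_UTF16_BE, b"")


def convert_fdf(src: str, dst: str) -> None:
    """Read src as raw bytes, strip the UTF-16 debris, and write the result to dst.

    Hypothetical helper: the file names are whatever you pass in.
    """
    Path(dst).write_bytes(strip_utf16_debris(Path(src).read_bytes()))
```

Calling convert_fdf("Project2.fdf", "filename.txt") should then do roughly what the redirected sed command does.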

That's about it.
Comment by Dan on Jul. 06, 2011 @ 10:59 pm
sed -e '...' | sed -e '...'

is never necessary.

sed -e '...' -e '...'

works as well. Also, cat foo | sed ... is unnecessary. Use sed -e '...' <file instead.

Also, as you mentioned, portability can't be guaranteed. GNU sed does support Unicode, so it shouldn't break even on non-ASCII characters.

Personally I'd probably go Python 3 for this task because it tends to be more anal about explicitly specifying charsets and encodings in text mode (without making assumptions) than just about anything else, and also of course has ctypes and zillions of other features for binary mode. Usually doing binary transformations on streams using standard Unix utilities isn't a good idea, though I think all the stuff you're using is probably Unicode-aware.
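
To make the explicit-encoding point concrete, here is a small Python 3 sketch. It assumes a field name arrives as raw bytes that either start with the FE FF byte-order mark (UTF-16, big-endian) or are plain ASCII; decode_field_name is a hypothetical helper, not part of PDFTK or any library. It also shows what blind byte-stripping does to a name containing a character outside Latin-1.

```python
import codecs


def decode_field_name(raw: bytes) -> str:
    """Decode a field name, naming the encoding explicitly instead of guessing."""
    if raw.startswith(codecs.BOM_UTF16_BE):
        # Skip the two-byte mark, then decode the rest as big-endian UTF-16.
        return raw[len(codecs.BOM_UTF16_BE):].decode("utf-16-be")
    # Assumption for this sketch: anything without the mark is plain ASCII.
    return raw.decode("ascii")


# "\u0150r" is a capital O with double acute followed by "r" (a made-up field name).
sample = codecs.BOM_UTF16_BE + "\u0150r".encode("utf-16-be")

# Explicit decoding recovers the original text.
decoded = decode_field_name(sample)

# Blindly deleting NUL bytes and the mark leaves the high byte 0x01 behind,
# so the first character is corrupted rather than preserved.
stripped = sample.replace(b"\x00", b"").replace(codecs.BOM_UTF16_BE, b"")
```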
Comment by JAY on Aug. 29, 2012 @ 04:24 am
I hate Windows; can't do none of this stuff with it.