parse a string for the last few characters

Postby guy » Thu Jul 21, 2016 10:23 am

Using bash, I am trying to parse a string of undefined length to check whether the last five characters are correct, in particular whether the value of NAME does not yet have the dot extension ".epub" appended.
It needs to cope if the string is shorter than the extension.

Trying variations on this:
Code:
if [ ! NAME=*.epub ]

The endless variations on dollar prefixes, enclosing quotes, white space and double-equals in the bash documentation are driving me mental. Can anybody explain what the correct code is, and why?
"Klinger, do you know how many zoots were killed to make that one suit?" — BJ Hunnicutt, 4077 M*A*S*H
guy
LXF regular
 
Posts: 1282
Joined: Thu Apr 07, 2005 12:07 pm
Location: Worcestershire

Re: parse a string for the last few characters

Postby Bazza » Thu Jul 21, 2016 2:41 pm

Hi guy...
This is totally longhand from the command line and is a starter for you...
iMac, OSX 10.11.5, default bash terminal...
Code:
Last login: Thu Jul 21 15:37:34 on ttys000
AMIGA:barrywalker~> strng="abc.epub"
AMIGA:barrywalker~> length=${#strng}
AMIGA:barrywalker~> substr="${strng:$((length-5)):5}"
AMIGA:barrywalker~> echo "$substr"
.epub
AMIGA:barrywalker~> if [ "$substr" = ".epub" ]; then echo "Do summat."; fi
Do summat.
AMIGA:barrywalker~> strng="ep"
AMIGA:barrywalker~> length=${#strng}
AMIGA:barrywalker~> substr="${strng:$((length-5)):5}"
AMIGA:barrywalker~> echo "$substr"

AMIGA:barrywalker~> if [ "$substr" = ".epub" ]; then echo "Do summat."; fi
AMIGA:barrywalker~> if [ "$substr" != ".epub" ]; then echo "Do summat."; fi
Do summat.
AMIGA:barrywalker~> _
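The session above can be wrapped into a small function. This is only a sketch: ends_with is a made-up helper name, not part of bash, and the ${var:offset:length} substring form is a bash extension rather than POSIX. It compares lengths first, so a string shorter than the suffix never produces a negative offset.

```shell
#!/bin/bash
# ends_with STRING SUFFIX: succeed if STRING ends with SUFFIX.
# (ends_with is a hypothetical helper name used for illustration.)
ends_with() {
    local str=$1 suffix=$2
    local slen=${#str} xlen=${#suffix}
    # A string shorter than the suffix cannot end with it.
    [ "$slen" -ge "$xlen" ] || return 1
    # Take the last xlen characters (bash substring expansion, not POSIX).
    [ "${str:slen-xlen:xlen}" = "$suffix" ]
}

if ends_with "abc.epub" ".epub"; then echo "abc.epub: has extension"; fi
if ! ends_with "ep" ".epub"; then echo "ep: no extension"; fi
```

Both echo lines run, matching the two cases tried by hand above.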
73...

Bazza, G0LCU...

Team AMIGA...

The less that I speak, the smarter I sound.
Bazza
LXF regular
 
Posts: 1565
Joined: Sat Mar 21, 2009 11:16 am
Location: Loughborough

Re: parse a string for the last few characters

Postby nelz » Thu Jul 21, 2016 3:07 pm

* is a shell filename expansion (glob) character; the [ command does not treat it as a wildcard in string comparisons. I can think of two ways (so far) of doing this. The first is

Code:
if ! printf '%s\n' "$NAME" | grep -q '\.epub$'; then


This checks whether the string ends with .epub. You can also use the shell's variable manipulation, like this

Code:
if [[ "${NAME}" == "${NAME%.epub}" ]] then


The % strips the following string from the end of the variable's value, if it is there; otherwise it returns the value unchanged. If all you want to do is check whether NAME ends in .epub and add it if not, you can avoid the if statement altogether and do this

Code:
NAME=${NAME%.epub}.epub


which basically removes the extension, if there, and then adds it back.
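To see the strip-then-append trick on a few inputs, here is a small sketch. add_epub is a made-up function name for illustration only; the ${1%.epub} expansion itself is standard POSIX parameter expansion.

```shell
# add_epub is a hypothetical helper name used only for this illustration.
add_epub() {
    # ${1%.epub} removes a trailing ".epub" if present, otherwise leaves $1 alone;
    # appending ".epub" then gives exactly one extension either way.
    printf '%s\n' "${1%.epub}.epub"
}

add_epub "book"        # prints book.epub
add_epub "book.epub"   # prints book.epub
add_epub "ep"          # prints ep.epub
```

Names shorter than the extension need no special handling, because % simply finds nothing to remove.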

BTW this is all standard POSIX, not Bash, so you should use /bin/sh in the shebang for maximum portability.
"Insanity: doing the same thing over and over again and expecting different results." (Albert Einstein)
nelz
Site admin
 
Posts: 8964
Joined: Mon Apr 04, 2005 11:52 am
Location: Warrington, UK

Re: parse a string for the last few characters

Postby guy » Thu Jul 21, 2016 3:22 pm

@Bazza Assuming I have NAME all set up, that should be something like:
Code:
length=${#NAME}
endchars="${NAME:$((length-5)):5}"
echo "$endchars"
if [ ! "$endchars" = ".epub" ]; then NAME=$NAME.epub
echo "$NAME"; fi

right? Note the pling (!) for "not". I haven't time to try it now, will have a go later. Thanks.

@Nelz looks neater, will take a look at that too. Must dash.

Re: parse a string for the last few characters

Postby nelz » Thu Jul 21, 2016 3:29 pm

I take it you mean my code? Although having met Bazza this is one of the few times I wouldn't disagree with your statement as it stands :lol:

Re: parse a string for the last few characters

Postby Bazza » Thu Jul 21, 2016 4:06 pm

Hi Nelz...
I thought that...
Code:
 [[ ... ]]

...was not POSIX compliant and was a bash and ksh feature.
(If I am wrong I stand corrected.)

Also I wanted to generate two random numbers in POSIX and found that "/dev/random" and "/dev/urandom" were not a necessary part of the device listing so I couldn't use those for my needs.

Getting deep into this POSIX lark means one has to check every external, (transient), command for their respective POSIX compliance.

The builtin 'read' statement is a pig as the only option POSIX requires is the '-r' switch... Ouch!
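As a rough sketch of living with that restriction, a prompt can be printed separately and the input read with only -r. The function name ask_name and the prompt text are invented for this example.

```shell
# ask_name is a hypothetical helper name; the prompt text is also just an example.
ask_name() {
    # printf with no trailing newline stands in for the non-portable "echo -n"
    # and for bash's "read -p"; the prompt goes to stderr so it is not captured.
    printf '%s' "Enter a file name: " >&2
    # -r keeps backslashes literal; IFS= keeps leading and trailing blanks.
    IFS= read -r reply
    printf '%s\n' "$reply"
}

# Typical use: name=$(ask_name)
```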

Re: parse a string for the last few characters

Postby Bazza » Thu Jul 21, 2016 4:13 pm

nelz wrote:I take it you mean my code? Although having met Bazza this is one of the few times I wouldn't disagree with your statement as it stands :lol:

"Friends; I have a whole host of friends."
Mississippi: El Dorado.
(My turn will come, evil grin.)

Re: parse a string for the last few characters

Postby nelz » Thu Jul 21, 2016 4:28 pm

Bazza wrote:Hi Nelz...
I thought that...
Code:
 [[ ... ]]

...was not POSIX compliant and was a bash and ksh feature.


It is, but I'm that used to using it I typed it instead of [..].
Having said that, it still works as expected with /bin/sh here, but its behaviour is undefined in POSIX, so use single brackets instead.
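For reference, a single-bracket form of the same test might look like the sketch below. The value given to NAME is just an example; the quoting and the single = are there because that is the form POSIX defines for [ ].

```shell
#!/bin/sh
NAME="book"    # example value, assumed for illustration

# Single brackets, quoted operands and a single = are all defined by POSIX.
if [ "$NAME" = "${NAME%.epub}" ]; then
    # Stripping .epub changed nothing, so the extension was missing.
    NAME="$NAME.epub"
fi

printf '%s\n' "$NAME"
```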
"Insanity: doing the same thing over and over again and expecting different results." (Albert Einstein)

Re: parse a string for the last few characters

Postby guy » Thu Jul 21, 2016 4:58 pm

nelz wrote:It is, but I'm that used to using it I typed it instead of [..].

are you that used to omitting the line separator before the "then" as well? ;)

This is now working:
Code:
if [ "${NAME}" == "${NAME%.epub}" ]; then
NAME=$NAME.epub
fi

The == bit puzzles me, I can't find mention of it: is it a strong equals or a not equals?

Also, Pluma (ie Gedit) highlights the second closing curly bracket as if it were a command - any idea why that is?

Re: parse a string for the last few characters

Postby guy » Thu Jul 21, 2016 5:34 pm

nelz wrote:BTW this is all standard POSIX, not Bash, so you should use /bin/sh in the shebang for maximum portability.

Something in the whole thing isn't POSIX. I changed to /bin/sh and the dot extension became .zip.
Here is the current working version:
Code:
#!/bin/bash
# makepub version 0.0.2   21 Jul 2016
#
# Builds an ePub document from a prepared xml source file tree
# If no file name is provided, defaults to new.epub
# Overwrites any existing mimetype and creates META-INF/container.xml

# input file name from terminal
echo -n "Enter ePub file name and press [ENTER]: "
read -r NAME

# check for null return
if [ -z "$NAME" ]; then
NAME=new
fi

# ensure .epub dot extension is added
if [ "${NAME}" == "${NAME%.epub}" ]; then
NAME=$NAME.epub
fi

# create epub file containing mimetype, uncompressed
echo "application/epub+zip" > mimetype
zip -X0 "$NAME" mimetype
rm mimetype

# add the xml container, compressed
mkdir META-INF
echo > META-INF/container.xml
printf '<?xml version="1.0" encoding="UTF-8"?>\n<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">\n    <rootfiles>\n        <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>\n   </rootfiles>\n</container>' >> META-INF/container.xml
zip -X9Dr "$NAME" META-INF
rm META-INF/container.xml
rmdir META-INF

# add the rest, compressed
zip -X9Dr "$NAME" OEBPS

Re: parse a string for the last few characters

Postby Bazza » Thu Jul 21, 2016 5:37 pm

http://www.tutorialspoint.com/unix/unix ... rators.htm
About a third of the way down, under arithmetic operators.
As far as I know == is also not a POSIX requirement, but bash and others handle it well, as per this URL.
(I use it in AudioScope.sh.)

Re: parse a string for the last few characters

Postby nelz » Thu Jul 21, 2016 7:59 pm

guy wrote:
nelz wrote:It is, but I'm that used to using it I typed it instead of [..].

are you that used to omitting the line separator before the "then" as well? ;)


Really Guy, I expect smart-arse remarks like that from Bazza, not you :shock:

guy wrote:This is now working:
Code:
if [ "${NAME}" == "${NAME%.epub}" ]; then
NAME=$NAME.epub
fi

The == bit puzzles me, I can't find mention of it: is it a strong equals or a not equals?


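It's an equality test. A single = is the assignment operator in something like NAME=value, and inside [ ] it only works as a comparison when it has spaces around it. Try this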

Code:
if [ "1"="2" ]; then
    echo true
else
    echo false
fi
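For what it's worth, the snippet above prints true: without spaces, "1"="2" reaches [ as the single word 1=2, and [ with one non-empty argument always succeeds. A side-by-side sketch (the variable names nospace and spaced are only for this illustration):

```shell
# Without spaces, "1"="2" is one word (1=2), and [ word ] is true for any non-empty word.
if [ "1"="2" ]; then nospace=true; else nospace=false; fi

# With spaces, = is the string comparison operator.
if [ "1" = "2" ]; then spaced=true; else spaced=false; fi

printf 'no spaces: %s\n' "$nospace"   # prints no spaces: true
printf 'spaces:    %s\n' "$spaced"    # prints spaces:    false
```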



guy wrote:Also, Pluma (ie Gedit) highlights the second closing curly bracket as if it were a command - any idea why that is?


Broken syntax parser?

Re: parse a string for the last few characters

Postby nelz » Thu Jul 21, 2016 8:00 pm

guy wrote:
Code:
#!/bin/bash
# ensure .epub dot extension is added
if [ "${NAME}" == "${NAME%.epub}" ]; then
NAME=$NAME.epub
fi


You can reduce that to a single line, as above, if you want.

Re: parse a string for the last few characters

Postby guy » Fri Jul 22, 2016 5:58 am

nelz wrote:Really Guy, I expect smart-arse remarks like that from Bazza, not you :shock:

I am learning from my betters (let us not enquire too deeply about elders) :D

I prefer the two-line version as its logic is more directly self-explanatory should I ever have cause to revisit. Were I into brevity it would be a choice between perl and machine code.

Re: parse a string for the last few characters

Postby Bazza » Fri Jul 22, 2016 7:28 am

Hi Nelz...
Note the spaces...
Code:
Last login: Fri Jul 22 08:19:24 on ttys000
AMIGA:barrywalker~> sh --posix
AMIGA:barrywalker~> if [ "1" = "1" ]; then echo true; else echo false; fi
true
AMIGA:barrywalker~> if [ "1" = "2" ]; then echo true; else echo false; fi
false
AMIGA:barrywalker~> if [ "1" == "2" ]; then echo true; else echo false; fi
false
AMIGA:barrywalker~> if [ "1" == "1" ]; then echo true; else echo false; fi
true
AMIGA:barrywalker~> _

Both do the same job, but I was given to understand that the "==" was not backwards compatible.
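Since the single = is the form POSIX defines for [ ], a portable script can stick to that, or sidestep test altogether with case, which does * pattern matching in any POSIX shell. A minimal sketch, with has_epub_ext as a made-up function name:

```shell
# has_epub_ext is a hypothetical helper name used for illustration.
has_epub_ext() {
    # case matches $1 against shell patterns, so *.epub means "ends in .epub".
    case $1 in
        *.epub) return 0 ;;
        *)      return 1 ;;
    esac
}

NAME="ep"    # example value, assumed for illustration
if ! has_epub_ext "$NAME"; then
    NAME="$NAME.epub"
fi
printf '%s\n' "$NAME"   # prints ep.epub
```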
