ianw1974

Replacing some text


Will be interesting to see if anyone knows how I can do this :)

 

I need to replace some text. The first 6 - 9 characters are numbers. Then there are two letters, followed by four more characters which are a mix of letters and numbers. There is a possibility that the two letters that follow the first 6 - 9 characters could occur again in the last four, so I need to make sure I only replace the first instance that follows the first 6 - 9 characters. I'm thinking of sed, but I'm not sure exactly how I can get it to find the 6 - 9 characters, and then replace the two letters with a new string.

 

Any ideas?


mmm been at the pub tonight .. too many pints to think straight .. but this is definitely a sed and regex thing.

 

shall call back here tomorrow


OK, cool, hope the pub was good. Could do with some :beer: myself.


mmmm

s/^.*[0-9]{6,9}[a-zA-Z][a-zA-Z].*

match from the beginning anything, followed by a digit 0-9 a minimum of 6 but a maximum of 9 times, then an alpha character, then another alpha character, then anything

a replace might look like this (run with sed -E so that {6,9} is read as a repeat count):

s/^.*[0-9]{6,9}[a-zA-Z][a-zA-Z].*/mytext/

 

something like that perhaps?
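For what it's worth, here is a minimal sketch of that idea that keeps the digits and swaps only the two letters. It assumes one entry per line with the digits at the very start, as in the original description. The function name replace_after_digits and the replacement string XX are made up for illustration.

```shell
# Hypothetical helper: replace the two letters that follow a leading run of
# 6 to 9 digits. "XX" stands in for whatever the new string should be.
replace_after_digits() {
    # -E switches sed to extended regular expressions, so {6,9} and the
    # capture group need no backslashes.
    # \1 puts the captured digits back; only the two letters are replaced.
    sed -E 's/^([0-9]{6,9})[A-Za-z]{2}/\1XX/'
}

printf '%s\n' '123456SA3456' '098775443SA6666' | replace_after_digits
# prints:
# 123456XX3456
# 098775443XX6666
```

Because the pattern is anchored with ^ and there is no g flag, a second SA in the last four characters is left alone.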


Will have to try it. Problem is the text file has one single line of all these numbers separated by spaces. It's a pity each number isn't on a separate line, then it might be easier to parse the file. For example, it would look like this:

 

 

123456SA3456 098775443SA6666

 

and so on instead of having each number on a separate line. A colleague went and did it with perl, so I'll have to play with this with the file they sent me and see what I can do :)

 

I need to replace the SA in the middle.


No, the spaces identify the next number in the file. Instead of each number being on a new line, they are separated by a space - probably by the system that generated the file. As far as I'm aware, the space is always the same - a single space.
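If the entries really are separated by exactly one space, the whole line can probably be handled in a single pass without splitting it first. The sketch below assumes each entry begins with its 6 - 9 digits, as in the example line. The function name replace_in_line and the replacement XX are hypothetical.

```shell
# Hypothetical helper: work on one long line of space-separated entries.
replace_in_line() {
    # (^| ) matches the start of the line or the single separating space,
    # so the digits must sit at the beginning of an entry.
    # \1 restores that separator, \2 restores the digits.
    # The g flag repeats the substitution for every entry on the line.
    sed -E 's/(^| )([0-9]{6,9})[A-Za-z]{2}/\1\2XX/g'
}

printf '%s\n' '123456SA3456 098775443SA6666' | replace_in_line
# prints: 123456XX3456 098775443XX6666
```

A two-letter pair in the last four characters is not touched, because it is never preceded by a separator plus 6 - 9 digits.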


Depending on the length of the input line, a simple way of doing what you want is:

sed 's/ /\n/g' <input.txt >temp.txt #This will give you a list with a newline at the end of each entry.

sed 's/[a-zA-Z][a-zA-Z]/-Replaced-/1' <temp.txt >out.txt #Job done!

 

Sounds a bit too simple for me! There must be a gotcha somewhere. :)
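Here is a rough sketch of the same split-then-replace idea as a single pipeline, with no temporary file. It uses tr for the split because, as far as I know, \n in the replacement part of s/// is understood by GNU sed but not by every sed. The function name split_replace_join and the replacement XX are made up, and paste is added only to put the entries back on one line.

```shell
# Hypothetical helper: split on spaces, replace once per entry, rejoin.
split_replace_join() {
    # Step 1: turn every space into a newline, one entry per line.
    tr ' ' '\n' |
        # Step 2: with no g flag, only the first pair of adjacent letters
        # on each line is replaced.
        sed 's/[A-Za-z][A-Za-z]/XX/' |
        # Step 3: join the lines back together with single spaces.
        paste -sd ' ' -
}

printf '%s\n' '123456SA3456 098775443SA6666' | split_replace_join
# prints: 123456XX3456 098775443XX6666
```

One possible gotcha: the pattern matches the first two adjacent letters on each line, whatever they are, so any letters in front of the digits would be replaced instead.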



I'm on hols at the minute but will have to give it a go. Just to clarify, is what you posted meant to replace the "SA" that is located somewhere in the middle? Because this is the only instance it should replace. It could appear in the last four characters and this one I wouldn't want to replace.

 

I won't be back in to check it until the end of the month :)


The only instance replaced is the first instance in the string.

The magic is the '1' flag at the end of the substitution, as in sed 's/[a-zA-Z][a-zA-Z]/-Replaced-/1' <temp.txt >out.txt

Also, the single quotes must be typed as shown; they keep the shell from altering the expression.
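To show what the numeric flag does, here is a small sketch where the occurrence number is passed in as an argument. The function name replace_nth_sa and the replacement XX are only for illustration.

```shell
# Hypothetical helper: replace only the Nth occurrence of SA on each line.
replace_nth_sa() {
    # $1 is the occurrence number. Double quotes are used here so the shell
    # can expand $1 inside the sed expression.
    sed "s/SA/XX/$1"
}

printf '%s\n' '123456SASA12' | replace_nth_sa 1
# prints: 123456XXSA12
printf '%s\n' '123456SASA12' | replace_nth_sa 2
# prints: 123456SAXX12
```

The count is per line, which is why it only helps once each entry sits on a line of its own.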

Edited by SilverSurfer60


Hi, no that didn't work. It replaced only the first instance in the file as everything is on one long line. I'll try creating carriage returns and test again.


No, didn't work with that either. Worked better, but replaced something completely different from what was intended. It was meant to replace the first instance of SA, but other text exists, and it searched for and matched any letters. I need it to find just "SA" and replace this.


There is text that appears before the first 6 to 9 numbers - this needs to be ignored.

It needs to locate the 6 - 9 numbers and replace the SA that follows this.

Therefore, any other instance of SA before these numbers is to be ignored and any SA that follows after the first instance of SA after the 6 - 9 numbers needs to be ignored.
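Those three rules could perhaps be expressed by tying the SA to the digits in front of it, rather than counting occurrences. The sketch below assumes the text before the digits never itself contains 6 or more digits directly followed by SA. The sample entries, the function name replace_sa_after_digits and the replacement XX are all invented for illustration.

```shell
# Hypothetical helper: replace SA only where it directly follows 6-9 digits.
replace_sa_after_digits() {
    # The capture group holds the digits and \1 puts them back unchanged.
    # An SA before the digits has no digit run in front of it, and an SA in
    # the last four characters has at most two digits in front of it, so
    # neither can match.
    # The g flag lets this work even when every entry is on one long line.
    sed -E 's/([0-9]{6,9})SA/\1XX/g'
}

printf '%s\n' 'SAref123456SA34SA ab098775443SA6666' | replace_sa_after_digits
# prints: SAref123456XX34SA ab098775443XX6666
```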


Not knowing exactly what your input is like I am guessing a little.

However I did paste the wrong Regex for the second sed.

 

The first is OK as it should split the input into a list of lines, each line with a \n where there is a space in the input.

 

That is sed 's/ /\n/g'<input.txt >temp

 

The second part should be sed 's/SA/-Replaced-/1' <temp >out.txt

Notice the absence of the 'g'

 

The '1' in the expression should replace the first occurrence of SA on each line.

 

Now if SA occurs before the 6-9 numbers that is a different situation.

Edited by SilverSurfer60
