ianw1974 Posted July 6, 2011
Will be interesting to see if anyone knows how I can do this :) I need to replace some text. The first 6 to 9 characters are numbers. Then there are two letters, followed by four more characters which are a mix of numbers and letters. There is a possibility that the two letters that follow the first 6 to 9 characters could also occur in the last four, so I need to make sure I only replace the first instance that follows the first 6 to 9 characters. I'm thinking of sed, but I'm not sure exactly how I can get it to find the 6 to 9 characters and then replace the two letters with a new string. Any ideas?
paul Posted July 6, 2011
mmm been at the pub tonight .. too many pints to think straight .. but this is definitely a sed and regex thing. Shall call back here tomorrow.
ianw1974 Posted July 6, 2011
OK, cool, hope the pub was good. Could do with some myself.
paul Posted July 7, 2011
mmmm
s/^.*[0-9]{6,9}[a-zA-Z][a-zA-Z].*
Match from the beginning anything, followed by 0-9 a minimum of 6 but a maximum of 9 times, then match an alpha character, then another alpha character, then match anything.
A replace might look like this:
s/^.*[0-9]{6,9}[a-zA-Z][a-zA-Z].*/mytext/
Something like that perhaps?
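A substitution of this shape swaps the whole match (digits included) for the new text, so to keep the digits the pattern probably needs a capture group. The sketch below is one way to do that, assuming each value sits on its own line and starts with the digits; `replace_pair` is a hypothetical helper name, and the replacement text is assumed to contain no `/`, `&` or backslash. Note that `{6,9}` needs `sed -E` (in basic regex syntax it would be written `\{6,9\}`).

```shell
#!/bin/sh
# Hypothetical helper: replace the two letters that follow a leading run of
# 6 to 9 digits, keeping the digits and everything after the letters.
# $1 is the replacement text (assumed free of '/', '&' and backslashes).
replace_pair() {
    # \1 puts the captured digits back; only the two letters are swapped.
    sed -E "s/^([0-9]{6,9})[a-zA-Z]{2}/\1$1/"
}

# Example run on the two sample values from the thread, one per line.
printf '%s\n' '123456SA3456' '098775443SA6666' | replace_pair 'XX'
```

Because the pattern is anchored with `^` and there is no `g` flag, a second letter pair in the last four characters is left alone.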
ianw1974 Posted July 7, 2011
Will have to try it. The problem is that the text file has one single line containing all these numbers, separated by spaces. It's a pity each number isn't on a separate line, as then it might be easier to parse the file. For example, it looks like this:
123456SA3456 098775443SA6666
and so on, instead of having each number on a separate line. A colleague went and did it with perl, so I'll have to play with this using the file they sent me and see what I can do :) I need to replace the SA in the middle.
SilverSurfer60 Posted July 7, 2011
Do you want to preserve the spaces? Is the number of spaces always the same?
ianw1974 Posted July 8, 2011
No, the spaces identify the next number in the file. Instead of each number being on a new line, they are separated by a space - probably by the system that generated the file. As far as I'm aware, the space is always the same - a single space.
SilverSurfer60 Posted July 8, 2011
Depending on the length of the line that is input, a simple way of doing what you want is:
sed 's/ /\n/g' <input.txt >temp.txt   # This will give you a list with a newline at the end of each entry.
sed 's/[a-zA-Z][a-zA-Z]/-Replaced-/ 1' <temp.txt >out.txt   # Job done!
Sounds a bit too simple for me! There must be a gotcha somewhere. :)
input.txt temp.txt out.txt
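The split-then-replace idea can be sketched as a single pipeline. This is only an illustration under a few assumptions: `split_tokens` and `join_tokens` are hypothetical helper names, `tr` is used for the split because `\n` in a sed replacement may not be understood by every sed, and the rejoin step is optional. As in the post, the middle step replaces the first pair of letters on each line, whatever those letters are.

```shell
#!/bin/sh
# Hypothetical helpers: one value per line, and back to a single line.
split_tokens() { tr ' ' '\n'; }
join_tokens() { paste -sd ' ' -; }

# Split, replace the first two-letter pair on each line, then rejoin.
printf '%s\n' '123456SA3456 098775443SA6666' \
    | split_tokens \
    | sed 's/[a-zA-Z][a-zA-Z]/-Replaced-/' \
    | join_tokens
```

Without a `g` flag, sed's `s` command already stops after the first match on each line, so once every value is on its own line a later letter pair in the last four characters is not touched.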
SilverSurfer60 Posted July 16, 2011
Have you tried it, Ian? If so, did it work or was it totally off the mark?
ianw1974 Posted July 16, 2011
I'm on hols at the minute but will have to give it a go. Just to clarify, is what you posted meant to replace the "SA" that is located somewhere in the middle? Because this is the only instance it should replace. It could also appear in the last four characters, and that one I wouldn't want to replace. I won't be back in to check it until the end of the month :)
SilverSurfer60 Posted July 20, 2011
The only instance replaced is the first instance in the string. The magic is the '1' at the end of the expression, as in:
sed 's/[a-zA-Z][a-zA-Z]/-Replaced-/ 1' <temp.txt >out.txt
Also, the single quotes are part of the expression.
Edited July 20, 2011 by SilverSurfer60
ianw1974 Posted August 3, 2011
Hi, no, that didn't work. It replaced only the first instance in the file, as everything is on one long line. I'll try creating carriage returns and test again.
ianw1974 Posted August 3, 2011
No, it didn't work with that either. It worked better, but replaced something completely different from what was intended. It was meant to replace the first instance of SA, but other text exists, and it searched for and found any text. I need it to find just "SA" and replace this.
ianw1974 Posted August 3, 2011
There is text that appears before the first 6 to 9 numbers - this needs to be ignored. It needs to locate the 6 to 9 numbers and replace the SA that follows them. Therefore, any instance of SA before these numbers is to be ignored, and any further SA after the one that directly follows the 6 to 9 numbers is to be ignored as well.
SilverSurfer60 Posted August 7, 2011
Not knowing exactly what your input is like, I am guessing a little. However, I did paste the wrong regex for the second sed. The first is OK, as it should split the input into a list of lines, putting a \n wherever there is a space in the input. That is:
sed 's/ /\n/g' <input.txt >temp
The second part should be:
sed 's/SA/-Replaced-/ 1' <temp >out.txt
Notice the absence of the 'g'. The '1' in the expression should replace the first occurrence of SA on each line. Now, if SA occurs before the 6-9 numbers, that is a different situation.
Edited August 7, 2011 by SilverSurfer60
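For the case where SA can also occur before the digits, one possible approach is to make the digits part of the match, so that SA is only replaced when it directly follows a run of 6 to 9 digits. The sketch below assumes the layout described in the thread; `replace_sa` is a hypothetical helper name and the replacement text is assumed to contain no `/`, `&` or backslash. The sample line, with extra text in front of the digits, is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: replace "SA" only where it directly follows 6 to 9 digits.
# $1 is the replacement text (assumed free of '/', '&' and backslashes).
replace_sa() {
    # The digits are captured and written back as \1, so only "SA" changes.
    # The g flag handles every value on a single space-separated line.
    sed -E "s/([0-9]{6,9})SA/\1$1/g"
}

# Made-up sample: leading text containing SA, and SA again in the last four.
printf '%s\n' 'SAab123456SA34SA xx098775443SA6666' | replace_sa '-Replaced-'
# prints: SAab123456-Replaced-34SA xx098775443-Replaced-6666
```

Since the match needs at least six digits immediately before SA, an SA in the leading text or in the last four characters does not qualify, and no splitting into separate lines is needed. A run of more than 9 digits followed by SA would still match on its last 9 digits, so this relies on the input really having the 6 to 9 digit layout.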