Script to update names (Hey aru !!)))


Guest anon

This should be a good test for our script guru aru :P :P

OK, here's the problem: I have several rsync jobs running on our server, updating files from several sites. When part of a file name changes, say "XFree86-4.3-24.4.92mdk.i586.rpm" gets updated to "XFree86-4.3-24.4.94mdk.i586.rpm", rsync sees it as a new file and so downloads the whole file. With several thousand files updated, this takes up lots of bandwidth.

The trick around this problem is easy with large files like ISOs: you simply rename the old file to the new version's name, then run rsync, so only the changes are downloaded. But with hundreds or thousands of files, it would take weeks to rename them all.

So, is it possible to write a script that checks the names of files on a remote server and, if only the last part of a file name has changed (xfree-17 is now xfree-18), renames the matching files on the receiving end?

Is it possible aru?


It is. :P

 

It's always possible. An rsync command without a destination part dumps the listing of the remote directory to stdout, so it is only a 'simple' matter of comparing file names and renaming the candidates. I must say that yours is a great idea. The only problem I see is that I have to study rsync to check that its algorithm allows this and that it won't break the files. Give me a couple of days :)
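
The listing idea could be sketched roughly as follows. This is only an illustration: it assumes the listing has the usual five-column layout (permissions, size, date, time, name), the helper name `listing_names` is made up, and the sample lines are invented stand-ins for what an rsync call with no destination would print.

```shell
#!/bin/bash
# Hypothetical helper: extract the file name column from an rsync listing.
# Assumes lines look like: "-rw-r--r--  1234 2004/03/27 11:15:39 name".
listing_names() {
    # Skip directories (lines starting with "d"). Everything after the
    # fourth field is the name, so names containing spaces survive.
    awk '!/^d/ && NF >= 5 {
        name = $0
        for (i = 1; i <= 4; i++) sub(/^ *[^ ]+ +/, "", name)
        print name
    }'
}

# Sample listing (invented data) standing in for real rsync output.
sample_listing='drwxr-xr-x        4096 2004/03/27 11:15:39 RPMS
-rw-r--r--     1234567 2004/03/27 11:15:39 RPMS/apache-1.3.28-3.1.92mdk.i586.rpm
-rw-r--r--      765432 2004/03/27 11:15:39 RPMS/apache2-2.0.47-6.3.92mdk.i586.rpm'

printf '%s\n' "$sample_listing" | listing_names
```

Peeling off the first four fields, rather than printing only the last one, is a design choice so that a name with spaces is not cut short.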

Edited by aru

It does allow it, and it works. If you have say MDK-9.2-ISO and you rename it to MDK-10-CE-ISO then run rsync, it will update it perfectly without problems. B)


aru, I think this is somewhat similar to the rpm update script that you gave me before. At least I think so, though that itself should be taken with a lot of salt. :twisted:

 

ciao!

Yes, I was thinking of that when I said it was possible. Good memory, ramfree! At least it seems that my scripts interest someone (even the one who KILLED my bash tips & tricks thread!!!) :wall: :P

 

Now I need some spare time to write it. I'll post each version here so that someone can test it (I'm connecting through a plain modem and can't do 'real time' tests against any rsync mirror) or help improve the code.

 

Coming soon: the rsyncOptimum bash script prototype (I accept name suggestions).

 

Here are some of the requirements the script must meet:

  • Recursion: it should process each directory of the server, though this is not mandatory since the main interest is the RPMS and SRPMS directories.
  • Check what the differences are between the server and the local mirror.
  • Ensure that the difference is in the version, and only in the version, of the file name and not anywhere else.
  • Rename the files whose version number differs.
  • Launch the real rsync command.
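
The "version and only the version" requirement might be checked along these lines. This is a sketch under the assumption that package files follow the name-version-release.arch.rpm pattern; the function names `rpm_name`, `rpm_arch` and `is_version_bump` are hypothetical and not part of any script in this thread.

```shell
#!/bin/bash
# Hypothetical helpers; assume names follow name-version-release.arch.rpm.

# Print the package name, i.e. everything before "-version-release".
rpm_name() {
    local base=${1##*/}            # drop any directory part
    base=${base%.rpm}              # drop the .rpm extension
    base=${base%.*}                # drop the architecture
    printf '%s\n' "${base%-*-*}"   # drop -version-release
}

# Print the architecture field (i586, noarch, src, ...).
rpm_arch() {
    local base=${1##*/}
    base=${base%.rpm}
    printf '%s\n' "${base##*.}"
}

# Succeed only when both files sit in the same directory, share name and
# architecture, and differ somewhere else (version or release).
is_version_bump() {
    local old=$1 new=$2
    [ "$old" != "$new" ] &&
    [ "$(dirname "$old")" = "$(dirname "$new")" ] &&
    [ "$(rpm_name "$old")" = "$(rpm_name "$new")" ] &&
    [ "$(rpm_arch "$old")" = "$(rpm_arch "$new")" ]
}

if is_version_bump RPMS/XFree86-4.3-24.4.92mdk.i586.rpm \
                   RPMS/XFree86-4.3-24.4.94mdk.i586.rpm; then
    echo "version bump"
fi
```

Counting fields from the right keeps package names that themselves contain dashes (apache2-common, for instance) in one piece.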

 

If I missed something or you see it from a different point of view, please say so ;)


Anon: how do you call your rsync commands? For example, the script might be called with the same syntax as rsync, pick the remote rsync server and the local directory out of the command line, and later call the real rsync process in a clean way.

 

For example, I call rsync as follows:

~$ rsync -av --partial --delete --exclude '/.*'  rsync://carroll.cac.psu.edu/mandrake/updates/9.1 /var/ftp/mandrake/updates

 

Thus rsyncOptimum (I'm still accepting names) might be called with the same syntax and extract the parameters it needs from the command line with regexps; then at the final stage it can launch rsync as usual with the full command line parameters.
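
One possible way to pick the two paths out of an rsync-style command line is sketched here. It uses positional parameters instead of regexps, and it assumes the source and destination are always the last two arguments; the function name `split_rsync_args` is invented for the example.

```shell
#!/bin/bash
# Hypothetical wrapper skeleton: takes the same arguments as rsync and
# assumes the last two are the remote source and the local destination.
split_rsync_args() {
    if [ "$#" -lt 2 ]; then
        echo "usage: rsyncOptimum [rsync options] SOURCE DEST" >&2
        return 1
    fi
    RSYNC_DIR="${@: -2:1}"   # second-to-last argument
    LOCAL_DIR="${!#}"        # last argument
}

split_rsync_args -av --partial --delete --exclude '/.*' \
    rsync://carroll.cac.psu.edu/mandrake/updates/9.1 /var/ftp/mandrake/updates
echo "remote: $RSYNC_DIR"
echo "local:  $LOCAL_DIR"
# After the renaming stage the wrapper would hand the untouched
# argument list to the real program, e.g.: rsync "$@"
```

Keeping the original argument list intact means the options never have to be parsed or rebuilt.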

 

Show me some examples of your rsync commands

Edited by aru

This command (unoptimized) should show differences between local listings and remote rsync listings:

 

~$ LOCAL_DIR='/var/ftp/mandrake/updates/'
~$ RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.1'
~$ diff <(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort) <(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)

 

Could any of you check whether it works? It is not a robust command, so please test it with and without the trailing '/' on the $LOCAL_DIR path (it probably doesn't matter there) and on the $RSYNC_DIR path (there the slash might cause trouble). This isn't a harmful command, so test it any time you want. (Of course, a local rsync mirror is mandatory ;) )
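
The trailing-slash sensitivity on the local side could be avoided by listing paths relative to the directory itself. The following is a sketch, not the command from this thread; `local_listing` is a made-up name and the demo tree uses invented files.

```shell
#!/bin/bash
# Hypothetical helper: list files under a directory as paths relative to
# it, whether or not the directory was given with a trailing slash.
local_listing() {
    local dir=${1%/}         # drop one trailing slash, if present
    # cd in a subshell so the caller's working directory is untouched;
    # the sed strips the leading "./" that find prints.
    ( cd "$dir" && find . -type f | sed 's|^\./||' | sort )
}

# Demonstration on a throwaway tree (invented file names).
demo=$(mktemp -d)
mkdir -p "$demo/9.1/RPMS"
touch "$demo/9.1/RPMS/apache-1.3.28-3.1.92mdk.i586.rpm" "$demo/9.1/md5sums"

local_listing "$demo"      # without the slash
local_listing "$demo/"     # with the slash: same output
rm -rf "$demo"
```

Because find runs from inside the directory, no prefix has to be stripped with a pattern built from the path.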

 

Once I'm sure that this diff part works, it will be trivial to adapt my mdkupdate script to "rename files and then use rsync" instead of using wget.

 

Now I have to go. Maybe later I'll resume this. I'm waiting for your input :jester:


This is what i get from that command after altering the dir:

[root@dp-5003 root]# LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/9.2'
[root@dp-5003 root]#  RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.2'
[root@dp-5003 root]#  diff <(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort) <(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)
sed: -e expression #1, char 49: Extra characters after command
0a1,237
>
> /apache-1.3.28-3.1.92mdk.i586.rpm
> /apache2-2.0.47-6.3.92mdk.i586.rpm
> /apache2-common-2.0.47-6.3.92mdk.i586.rpm
bla bla bla


Here is the first prototype. I can't test whether it works or not, so I need some feedback in order to continue.

What it does is explained in the comments within the code. As written now, it only shows what would be renamed but doesn't rename anything, hence it is safe for testing.

Change LOCAL_DIR and RSYNC_DIR to real values.

 

 

#! /bin/bash

# $Id: rsyncOptimum,v 1.1 2004/03/27 11:15:39 aru Exp aru $
# PROTOTYPE :: FOR TESTING PURPOSES

# Main variables; in the final version most will be either stripped from the
# command line or from enviromental variables:
LOCAL_DIR='/var/ftp/mandrake/updates/'
RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.1'

# Listings, maybe I'll put this in temp files instead of variables.
LOCAL_LISTING=$(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)
RSYNC_LISTING=$(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort)

# Lets see if there are differences between both listings, we'll only store the
# filenames that are on the remote listing, later we'll see if in the local
# list are candidates for being renamed:
updates="$(diff <(echo "${RSYNC_LISTING}") <(echo "${LOCAL_LISTING}") | sed -n '/^</ {s|^< ||p;}')"

# Checking if the updates are new versions or not, if they are, then change the
# name of the local old version to match the new version one:
if [ "${updates}" != "" ]; then
    for file in ${updates}; do
        local_file=$(echo "${LOCAL_LISTING}" | egrep "^${file%%-[0-9]*}-[0-9]")
        if [ "${local_file}" != "" ]; then
            # REMOVE the "echo" part to really rename the file:
            echo mv ${local_file} ${file}
        fi
    done
fi

 

waiting for your feedback


This is what i get from that command after altering the dir:

[root@dp-5003 root]# LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/9.2'
[root@dp-5003 root]#  RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.2'
[root@dp-5003 root]#  diff <(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort) <(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)
sed: -e expression #1, char 49: Extra characters after command
0a1,237
>
> /apache-1.3.28-3.1.92mdk.i586.rpm
> /apache2-2.0.47-6.3.92mdk.i586.rpm
> /apache2-common-2.0.47-6.3.92mdk.i586.rpm
bla bla bla

That's because of your sed version; append a ';' at the end of the sed commands.

Replace sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' with sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p;}'

and test the command again.

 

If it works, modify the prototype script in the same way and tell me. :)
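
A form of the same filter that should be less dependent on the sed version is sketched below. Giving each command its own -e avoids the question of the ';' before the closing '}', and two separate delete commands replace the '\|' alternation, which is a GNU extension in basic regular expressions. The function name `between_banners` and the sample text are invented for the example.

```shell
#!/bin/bash
# Hypothetical helper: keep only the lines between the "receiving" banner
# and the "wrote" summary. One -e per command avoids both the missing ";"
# before "}" and the GNU-only "\|" alternation.
between_banners() {
    sed -n -e '/^receiving/,/^wrote /{' \
           -e '/^receiving/d' \
           -e '/^wrote /d' \
           -e 'p' \
           -e '}'
}

# Invented sample standing in for the output of "rsync -av".
sample='receiving file list ... done
9.2/RPMS/apache-1.3.28-3.1.92mdk.i586.rpm
9.2/RPMS/apache2-2.0.47-6.3.92mdk.i586.rpm
wrote 120 bytes  read 4096 bytes  1000.00 bytes/sec'

printf '%s\n' "$sample" | between_banners
```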

Edited by aru

OK, I altered the sed and it fixed the problem B)

The files in MDK-9.2 were already up to date, so I changed one of the apache files from xxx.i586.rpm to xxx.i686.rpm to check the script, but no change was listed?

[root@dp-5003 root]# #! /bin/bash
[root@dp-5003 root]#
[root@dp-5003 root]# # $Id: rsyncOptimum,v 1.1 2004/03/27 11:15:39 aru Exp aru $
[root@dp-5003 root]# # PROTOTYPE :: FOR TESTING PURPOSES
[root@dp-5003 root]#
[root@dp-5003 root]# # Main variables; in the final version most will be either stripped from the
[root@dp-5003 root]# # command line or from enviromental variables:
[root@dp-5003 root]# LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/9.2'
[root@dp-5003 root]# RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.2'
[root@dp-5003 root]#
[root@dp-5003 root]# # Listings, maybe I'll put this in temp files instead of variables.
[root@dp-5003 root]# LOCAL_LISTING=$(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)
[root@dp-5003 root]# RSYNC_LISTING=$(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p;}' | awk '{print $NF}' | sort)
[root@dp-5003 root]#
[root@dp-5003 root]# # Lets see if there are differences between both listings, we'll only store the
[root@dp-5003 root]# # filenames that are on the remote listing, later we'll see if in the local
[root@dp-5003 root]# # list are candidates for being renamed:
[root@dp-5003 root]# updates="$(diff <(echo "${RSYNC_LISTING}") <(echo "${LOCAL_LISTING}") | sed -n '/^</ {s|^< ||p;}')"
[root@dp-5003 root]#
[root@dp-5003 root]# # Checking if the updates are new versions or not, if they are, then change the
[root@dp-5003 root]# # name of the local old version to match the new version one:
[root@dp-5003 root]# if [ "${updates}" != "" ]; then
>     for file in ${updates}; do
>         local_file=$(echo "${LOCAL_LISTING}" | egrep "^${file%%-[0-9]*}-[0-9]")
>         if [ "${local_file}" != "" ]; then
>             # REMOVE the "echo" part to really rename the
>             echo mv ${local_file} ${file}
>         fi
>     done
> fi
[root@dp-5003 root]#


Well, the code was meant to be a script, so it should first be copied into a file (named rsyncOptimum) and run from there. :P That code was supposed to be the first prototype version of the script!! :deal:

 

OK, if you want to run the code directly on the command line, do some checks at each step. For example, echo the value of each variable after each assignment, so I can see what is going wrong.
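
A small helper could make those per-step checks less tedious. This is just a sketch; `show_var` is a made-up name, and it prints to stderr so that it does not mix with the script's normal output.

```shell
#!/bin/bash
# Hypothetical debugging helper: print a variable's name, how many lines
# it holds, and its first few lines, to stderr.
show_var() {
    local name=$1 value=${!1}
    local count=0
    [ -n "$value" ] && count=$(printf '%s\n' "$value" | wc -l | tr -d ' ')
    printf '%s: %s line(s)\n' "$name" "$count" >&2
    printf '%s\n' "$value" | head -n 3 >&2
}

LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/9.2'
show_var LOCAL_DIR
```

Calling it after each assignment (show_var LOCAL_LISTING, show_var RSYNC_LISTING, show_var updates) would show at which step a listing comes out empty. Running the script with bash -x gives a similar trace without editing it.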

 

I'll do some tests here as I've an idea of how to do 'real time tests' with my modem,

:jester:

 

Hope I'll find what is going wrong.


Now I have my own mirror :P

 

~$ RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.1'
~$ RSYNC_LISTING=$(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort)
~$ for file in $RSYNC_LISTING; do dirname $file; done | uniq | xargs mkdir -p
~$ for file in $RSYNC_LISTING; do touch $file; done
~$ tree 
.
`-- 9.1
   |-- RPMS
   |   |-- BitchX-1.0-0.c19.4.1mdk.i586.rpm
   |   |-- MySQL-4.0.11a-5.1mdk.i586.rpm
<...>
   |-- descriptions
   |-- ls-lR
   `-- md5sums

4 directories, 413 files
~$

 

Of course they are empty files, but for testing they are more than enough :mr-green:
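
The same skeleton-mirror trick might be wrapped in a function that copes with spaces in names. This sketch assumes its input contains file paths only (no bare directory entries); `make_skeleton` is an invented name.

```shell
#!/bin/bash
# Hypothetical helper: read relative file paths on stdin and create each
# one as an empty file under the given root, creating directories as
# needed. Assumes the input lists files only, not directories.
make_skeleton() {
    local root=$1 path
    while IFS= read -r path; do
        [ -n "$path" ] || continue            # skip blank lines
        mkdir -p "$root/$(dirname "$path")"
        touch "$root/$path"
    done
}

# Demonstration with two names taken from the listing above.
demo=$(mktemp -d)
printf '%s\n' \
    '9.1/RPMS/BitchX-1.0-0.c19.4.1mdk.i586.rpm' \
    '9.1/md5sums' | make_skeleton "$demo"
find "$demo" -type f | sort
rm -rf "$demo"
```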


OK, I altered the sed and it fixed the problem B)

The files in MDK-9.2 were already up to date, so I changed one of the apache files from xxx.i586.rpm to xxx.i686.rpm to check the script, but no change was listed?

....
[root@dp-5003 root]# # command line or from enviromental variables:
[root@dp-5003 root]# LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/9.2'
[root@dp-5003 root]# RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.2'
[root@dp-5003 root]#
...

 

What if you try:

LOCAL_DIR='/var/ftp/pub/Mandrake/Updates/'

See my example: I wrote my code with ".../updates/" in mind, including the final slash, instead of ".../updates/9.1". The local listing has to start at the same level as the remote one (9.1/RPMS/...) for the names to match. At least here it works:

 

~$ #! /bin/bash
~$
   
~$ # $Id: rsyncOptimum,v 1.1 2004/03/27 11:15:39 aru Exp aru $
~$ # PROTOTYPE :: FOR TESTING PURPOSES
~$
   
~$ # Main variables; in the final version most will be either stripped from the
~$ # command line or from enviromental variables:
~$ LOCAL_DIR='/home/arusabal/TEST/updates/'
~$ RSYNC_DIR='rsync://carroll.cac.psu.edu/mandrake/updates/9.1'
~$
   
~$ # Listings, maybe I'll put this in temp files instead of variables.
~$ LOCAL_LISTING=$(find ${LOCAL_DIR} -name '*' | sed -n "s@${LOCAL_DIR}@@gp" | sort)
~$ RSYNC_LISTING=$(rsync -av ${RSYNC_DIR} | sed -n '/^receiving/,/^wrote / {/^receiving\|^wrote /d;p}' | awk '{print $NF}' | sort)
~$
   
~$ # Lets see if there are differences between both listings, we'll only store the
~$ # filenames that are on the remote listing, later we'll see if in the local
~$ # list are candidates for being renamed:
~$ updates="$(diff <(echo "${RSYNC_LISTING}") <(echo "${LOCAL_LISTING}") | sed -n '/^</ {s|^< ||p;}')"
~$
   
~$ # Checking if the updates are new versions or not, if they are, then change the
~$ # name of the local old version to match the new version one:
~$ if [ "${updates}" != "" ]; then
>     for file in ${updates}; do
>         local_file=$(echo "${LOCAL_LISTING}" | egrep "^${file%%-[0-9]*}-[0-9]")
>         if [ "${local_file}" != "" ]; then
>             # REMOVE the "echo" part to really rename the file:
>             echo mv ${local_file} ${file}
>         fi
>     done
> fi
mv 9.1/RPMS/apache2-manual-2.0.47-1.6.80mdk.i586.rpm 9.1/RPMS/apache2-manual-2.0.47-1.6.91mdk.i586.rpm
~$

 

See the output: I changed the version of the apache2-manual package and the script noticed it.
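
A stricter take on the matching step is sketched below. It is not the prototype itself: it assumes name-version-release.arch.rpm file names, only proposes a rename when exactly one local candidate exists, and ignores architecture changes. The names `rpm_key` and `plan_renames` are hypothetical, and like the prototype it only prints the mv commands.

```shell
#!/bin/bash
# Hypothetical helpers; names are assumed to follow
# name-version-release.arch.rpm.

# Reduce a path to "dir/name.arch", the part that must stay the same.
rpm_key() {
    local dir=${1%/*} base=${1##*/} arch
    [ "$dir" = "$1" ] && dir=.       # no directory part
    base=${base%.rpm}
    arch=${base##*.}
    base=${base%.*}                  # drop the architecture
    printf '%s/%s.%s\n' "$dir" "${base%-*-*}" "$arch"
}

# Print "mv old new" for every remote rpm whose key matches exactly one
# local rpm that no longer exists remotely.
# $1 = remote listing, $2 = local listing (one path per line).
plan_renames() {
    local remote=$1 locals=$2 new old key matches
    while IFS= read -r new; do
        case $new in *.rpm) ;; *) continue ;; esac
        # Already present locally under this exact name: nothing to do.
        printf '%s\n' "$locals" | grep -Fxq -- "$new" && continue
        key=$(rpm_key "$new")
        matches=$(printf '%s\n' "$locals" | while IFS= read -r old; do
            case $old in *.rpm) ;; *) continue ;; esac
            # Skip local files that still exist remotely.
            printf '%s\n' "$remote" | grep -Fxq -- "$old" && continue
            [ "$(rpm_key "$old")" = "$key" ] && printf '%s\n' "$old"
        done)
        # Ambiguous or missing candidates are left for rsync to handle.
        if [ "$(printf '%s\n' "$matches" | grep -c .)" -eq 1 ]; then
            printf 'mv %q %q\n' "$matches" "$new"
        fi
    done <<< "$remote"
    return 0
}

remote='9.1/RPMS/apache2-manual-2.0.47-1.6.91mdk.i586.rpm
9.1/RPMS/apache2-2.0.47-1.6.91mdk.i586.rpm'
locals='9.1/RPMS/apache2-manual-2.0.47-1.6.80mdk.i586.rpm
9.1/RPMS/apache2-2.0.47-1.6.91mdk.i586.rpm'
plan_renames "$remote" "$locals"
```

The nested loops are slow for thousands of files; the point here is only the matching rule, not performance.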

 

Now I have to go, tomorrow will be another day.

Edited by aru
