Tag Archives: Programming

MacOS X and split

Split, on Mac OS X, doesn’t have the -d option to number files. This is a right royal pain when you are splitting up a dd image, as I couldn’t figure out how to get either X-Ways Forensics or EnCase to accept the split image when suffixed with aaa, aab, aac etc. First time out of the gate I just paid a child to sit and re-number the lot for me ( which cost me £5 – but saved my sanity ), but for future reference, and to save my financial status, here is an (albeit long) one-liner for the command line that will take any three-letter suffixed filename and change it to the corresponding numerical value. (There are probably cleaner ways of doing this – feel free to let me know and I’ll be happy to update them here.)

ls test.dd.* | awk 'BEGIN {FS="\\."}{print $3 ":" $0}' | 
awk -v FS="" '{ convert="abcdefghijklmnopqrstuvwxyz" } { first=index(convert,$1); 
second=index(convert,$2); third=index(convert,$3); 
printf "test.dd.%03d:", (third + (26 * ( second - 1 )) + ( 676 * ( first - 1 ))); print $0 }' | 
awk -v FS=":" '{print$3 " " $1}' | xargs -n 2 mv

Note that the output filename prefix (test.dd.) is hard coded, and that the numbering starts at 001 rather than 000.

So this takes all the files test.dd.aaa, test.dd.aab, test.dd.aac etc. and converts them to test.dd.001, test.dd.002, test.dd.003. This will work for any number of files up to and including zzz, which is file number 17,576 (26 × 26 × 26) – but extending it further wouldn’t be a particularly challenging task…
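For comparison, here is a rough Python sketch of the same conversion. It treats the suffix as a base-26 number and adds one, so that aaa becomes 001 just as in the awk version. The function names (suffix_to_number, numbered_name, rename_all) are mine, made up for illustration, and I am assuming the suffix only ever contains lower-case letters.

```python
import glob
import os
import string


def suffix_to_number(suffix):
    """Read a split-style suffix such as 'aab' as base 26, 1-based (aaa -> 1)."""
    value = 0
    for ch in suffix:
        value = value * 26 + string.ascii_lowercase.index(ch)
    return value + 1


def numbered_name(filename):
    """Map 'test.dd.aab' to 'test.dd.002' (zero padded to at least 3 digits)."""
    base, suffix = filename.rsplit(".", 1)
    return "%s.%03d" % (base, suffix_to_number(suffix))


def rename_all(pattern):
    """Rename every file matching the glob pattern, e.g. 'test.dd.???'."""
    for name in sorted(glob.glob(pattern)):
        os.rename(name, numbered_name(name))
```

Calling rename_all('test.dd.???') in the directory holding the pieces should do the same job as the one-liner, without the filename prefix being hard coded.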


How to automate Twitter and make a bit of a tit of yourself at the same time …

Free twitter badge (Photo credit: Wikipedia)

Oh dear, oh dear – post in haste, repent at leisure ! ( If you don’t know what I’m talking about – see here. ) I’m glad to say that I recently read a book [1] on business that suggested that an Agile approach ( release early, release often, and fix your bugs as you go along ) was definitely the way towards successful business, so I’m going to imply that I did it on purpose.

So what’s gone wrong ? I can see that the script ( twitter.py -r ) is running fine from cron ( /var/log/cron – it appears to run every five minutes ) – I know that if I run it from the command line within 5 minutes of creating the schedule it works, which implies that the logic in the program ( if badly written ) is at least ok … So where is the issue occurring ? I thought initially that it was a path problem – I guess that my fault so far is that I’ve not made any effort to capture any errors. Ok, so I’ll give that a go … Great, nada being reported by cron. That’s not helpful.
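One way of capturing errors from a job that cron runs silently is to have the script write its own log file. The following is only a sketch of that idea using Python’s standard logging module; the logger name twitter_cron and the helper names make_logger and run_logged are invented for the example, and task stands in for whatever the -r branch of the script actually does.

```python
import logging


def make_logger(log_path):
    """Return a logger that appends timestamped messages to log_path."""
    logger = logging.getLogger("twitter_cron")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching a second handler on repeat calls
        handler = logging.FileHandler(log_path)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def run_logged(task, logger):
    """Run task(); record success, or the full traceback if it raises."""
    try:
        task()
        logger.info("run completed")
        return True
    except Exception:
        logger.exception("run failed")
        return False
```

With something like this wrapped around the run, an exception such as a rejected status update ends up in the log file with its traceback rather than vanishing.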

Ah hah ! Got an error at last.

tweepy.error.TweepError: Status is a duplicate.

Whilst I can’t find any specific references to the error, it seems to me to be quite self-explanatory. You can’t keep re-tweeting the same message – it needs to differ. That explains why the HootSuite interface was such a pain in the neck, as they offload this onto the user to populate their CSV file with. I guess that the outstanding question then is “How much does a Tweet need to differ by _not_ to be considered a duplicate ?” By definition this should be a single char, so for my 24 scheduled tweets I need to create 24 unique chars to add to the tweet. The simplest way would be to either count up or count down; this would hopefully give sufficient change to be different, as well as indicating easily to me, if not the casual observer, how far through the re-Tweet lifecycle it currently is.

The code now reads as follows:

#!/usr/bin/env python
##########################
# Python Auto Re-Tweeter #
# (C) Simon Biles 2012   #
# http://www.biles.net   #
##########################
# Version 0.01 -         #
# A first stab at it !   #
##########################
# Version 0.02 -         #
# A working version !    #
##########################

# All those tasty Python imports
import argparse
import datetime
import struct
import sys
import tweepy

from ConfigParser import SafeConfigParser

# Get the command line arguments
parser = argparse.ArgumentParser(description='Regular Tweet Generator.')
parser.add_argument('-s','--schedule', action='store_true', help='Schedule a Tweet for the next 7 days')
parser.add_argument('-r', '--run', action='store_true', help='Run the schedule')
parser.add_argument('-u','--update', action='store_true', help='Update Status Tweet immediately')
parser.add_argument('tweet', nargs='?')
args = parser.parse_args();

# Global variable
time_fmt = "%Y-%m-%d %H:%M"

# Get the config file data
parser = SafeConfigParser()
parser.read('twitter.conf')
CONSUMER_KEY = parser.get('consumer_keys','CONSUMER_KEY')
CONSUMER_SECRET = parser.get('consumer_keys','CONSUMER_SECRET')
ACCESS_KEY = parser.get('access_keys','ACCESS_KEY')
ACCESS_SECRET = parser.get('access_keys','ACCESS_SECRET')
FILE_NAME = parser.get('file_name', 'SCHEDULE_FILE')

# Main body

# Quick Command Line Update
if args.run == False and args.schedule == False and args.update == True:
   auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
   auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
   api = tweepy.API(auth)
   # With -u on the command line sys.argv[1] is the flag itself, so use the parsed argument
   api.update_status(args.tweet)
   sys.exit()
# Schedule a Tweet by adding it to the schedule file
elif args.run == False and args.schedule == True and args.update == False:
   file_obj = open(FILE_NAME, 'a')
   current = datetime.datetime.now()
   nexttweet = current;
   count = 0
   while (count < 24):
      diff = datetime.timedelta(hours=count)
      nexttweet = nexttweet + diff
      tweettime = nexttweet.strftime(time_fmt) + " " + str(count+1) + "/24 " + args.tweet +"\n"
      file_obj.write(tweettime)
      count = count + 1
   file_obj.close()
   sys.exit()
# Parse the schedule file and see if anything should have happened within 5 minutes of now.
elif args.run == True and args.schedule == False and args.update == False:
   file_obj = open(FILE_NAME, 'r')
   auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
   auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
   api = tweepy.API(auth)
   current = datetime.datetime.now()
   baseformat = "16s 1x"
   for line in file_obj:
      line = line.rstrip('\n')
      numremain = len(line) - struct.calcsize(baseformat)
      lformat = "%s %ds" % (baseformat, numremain)
      tweettime, tweet = struct.unpack(lformat, line)
      linetime = datetime.datetime.strptime(tweettime, time_fmt)
      delta = linetime - current
      # Tweet anything scheduled within five minutes either side of now
      if datetime.timedelta(minutes=-5) <= delta <= datetime.timedelta(minutes=5):
         api.update_status(tweet)
   file_obj.close()
   sys.exit()
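The line parsing in the run branch leans on struct, which is one of the bits that works without being especially obvious. As a sketch of an alternative, and assuming every schedule line starts with a 16 character timestamp in the time_fmt format followed by a single space, plain string slicing does the same job. The names parse_line and is_due are hypothetical, and this is written for Python 3 rather than the Python 2 of the listing above.

```python
import datetime

TIME_FMT = "%Y-%m-%d %H:%M"
STAMP_LEN = 16  # length of a timestamp such as "2012-05-01 09:30"


def parse_line(line):
    """Split a schedule line into (scheduled datetime, tweet text)."""
    line = line.rstrip("\n")
    stamp = line[:STAMP_LEN]
    text = line[STAMP_LEN + 1:]  # skip the separating space
    return datetime.datetime.strptime(stamp, TIME_FMT), text


def is_due(when, now, window_minutes=5):
    """True if 'when' lies within window_minutes either side of 'now'."""
    return abs(when - now) <= datetime.timedelta(minutes=window_minutes)
```

One thing this makes easier to see: a window of five minutes either side, checked every five minutes, can match the same line on two consecutive runs, so the second attempt would be refused as a duplicate.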

So there you have it, a working version ! I’ve watched 10/24 Tweets fly by over the weekend, and the other 14 will play out over the next week and a bit – I must admit though that it is a bit front loaded at the moment, and behaves a little “spamily” for my liking. I think before I unleash it again, I might start it off at 6 hour intervals and let it grow from there for 18 Tweets. I’m thinking of how to track its success, and I have an idea, but more of that later !

 


1. The book in question was ReWork: Change the Way You Work Forever, which I rather enjoyed. It was short and to the point – I don’t think that it is necessarily a “how-to” guide, but it did get me thinking about a few things and gave me some inspiration to go out on the web and make a tit of myself like this 😉


How to Automate Twitter – continued …

English: Python logo Deutsch: Python Logo (Photo credit: Wikipedia)

Pre-warning – I wrote this pretty late at night for me, and it doesn’t actually work at the moment – consider this an Agile release process …

Ok, so here is version 0.01 of the automated Twitter utility (I will be using it to re-publicise this blog entry, along with a couple of others – so this could either be a brilliant advert or a dire warning! [1]). I’ve changed the frequency criteria somewhat from the original: there are now 24 re-tweets with an hour’s increase in delay between each ( e.g. tweet, 1 hour, 2 hours, 3 hours and so on up to a 23 hour gap before the last one – overall this spreads out 24 tweets over 11 days or so ) – I’ll give that a go, and maybe experiment from there. I’ve also included a schedule file reference in the configuration file so that it is easier to change should it be necessary. The overall lack of semi-colons and brackets has given me a nervous twitch, but other than that, given that I’m pretty new to Python, that was – all in all – not a terrible experience !
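To check the “11 days or so” figure, here is a small sketch of the spacing described above: each gap is one hour longer than the last, starting from a gap of zero. The function name schedule_times and its first_gap_hours and step_hours parameters are my own additions for illustration (written for Python 3), so that other starting intervals can be tried.

```python
import datetime


def schedule_times(start, count=24, first_gap_hours=0, step_hours=1):
    """Return 'count' datetimes whose gaps grow by step_hours each time."""
    times = []
    when = start
    gap = first_gap_hours
    for _ in range(count):
        when = when + datetime.timedelta(hours=gap)
        times.append(when)
        gap += step_hours
    return times


if __name__ == "__main__":
    start = datetime.datetime(2012, 5, 1, 9, 0)
    times = schedule_times(start)
    # Gaps of 0 + 1 + ... + 23 hours add up to 276 hours, i.e. 11.5 days
    print(times[-1] - start)
```

Passing first_gap_hours=6 and count=18 gives a less front loaded spread, on the assumption that the gaps still grow by an hour each time.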

#!/usr/bin/env python

##########################
# Python Auto Re-Tweeter #
# (C) Simon Biles 2012   #
# http://www.biles.net   #
##########################
# Version 0.01 -         #
# A first stab at it !   #
##########################

# All those tasty Python imports

import argparse
import datetime
import struct
import sys
import tweepy
from ConfigParser import SafeConfigParser

# Get the command line arguments

parser = argparse.ArgumentParser(description='Regular Tweet Generator.')
parser.add_argument('-s','--schedule', action='store_true', help='Schedule a Tweet for the next 7 days')
parser.add_argument('-r', '--run', action='store_true', help='Run the schedule')
parser.add_argument('-u','--update', action='store_true', help='Update Status Tweet immediately')
parser.add_argument('tweet', nargs='?')
args = parser.parse_args();

# Global variable

time_fmt = "%Y-%m-%d %H:%M"

# Get the config file data

parser = SafeConfigParser()
parser.read('twitter.conf')
CONSUMER_KEY = parser.get('consumer_keys','CONSUMER_KEY')
CONSUMER_SECRET = parser.get('consumer_keys','CONSUMER_SECRET')
ACCESS_KEY = parser.get('access_keys','ACCESS_KEY')
ACCESS_SECRET = parser.get('access_keys','ACCESS_SECRET')
FILE_NAME = parser.get('file_name', 'SCHEDULE_FILE')

# Main body

# Quick Command Line Status Update

if args.run == False and args.schedule == False and args.update == True:
   auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
   auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
   api = tweepy.API(auth)
   # With -u on the command line sys.argv[1] is the flag itself, so use the parsed argument
   api.update_status(args.tweet)
   sys.exit()

# Schedule a Tweet by adding it to the schedule file

elif args.run == False and args.schedule == True and args.update == False:
   file_obj = open(FILE_NAME, 'a')
   current = datetime.datetime.now()
   nexttweet = current;
   count = 0
   while (count < 24):
      diff = datetime.timedelta(hours=count)
      nexttweet = nexttweet + diff
      tweettime = nexttweet.strftime(time_fmt) + " " + args.tweet +"\n"
      file_obj.write(tweettime)
      count = count + 1
   file_obj.close()
   sys.exit()

# Parse the schedule file and see if anything should have happened within 5 minutes of now.

elif args.run == True and args.schedule == False and args.update == False:
   file_obj = open(FILE_NAME, 'r')
   auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
   auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
   api = tweepy.API(auth)
   current = datetime.datetime.now()
   baseformat = "16s 1x"
   for line in file_obj:
      line = line.rstrip('\n')
      numremain = len(line) - struct.calcsize(baseformat)
      lformat = "%s %ds" % (baseformat, numremain)
      tweettime, tweet = struct.unpack(lformat, line)
      linetime = datetime.datetime.strptime(tweettime, time_fmt)
      delta = linetime - current
      if delta <= datetime.timedelta(minutes=5) and delta >= datetime.timedelta(minutes=-5):
         api.update_status(tweet)
   file_obj.close()
   sys.exit()

I think that it works ok, I’m going to carry out testing in live – like all good practice guides suggest you shouldn’t ! I’m going to set it to run with the -r command line switch from cron every five minutes.

There are one or two features that I think that I should look into in the near-ish future:

  • Cleaning up the schedule file – obviously it is only going to get longer and longer and thus the program will consume more and more resources as it tries to parse it.
  • I’d like to automate the script so that it monitors my Twitter account for updates from WordPress, and then adds those to the schedule immediately – I’m lazy you see …
  • I’m sure that there are a few better ways of doing things in there, and also there could be a little more in the way of commentary and instruction ( there are also, frankly, one or two bits that work, but that I don’t understand how ! )
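The first of those items, cleaning up the schedule file, could be sketched along these lines. It assumes the same line layout as the script (a 16 character timestamp, a space, then the tweet), keeps anything that is not yet safely in the past, and leaves lines it cannot parse alone. The name prune_schedule and the keep_minutes parameter are hypothetical, and it is written for Python 3.

```python
import datetime

TIME_FMT = "%Y-%m-%d %H:%M"


def prune_schedule(lines, now, keep_minutes=5):
    """Return only the lines scheduled no earlier than keep_minutes before now."""
    cutoff = now - datetime.timedelta(minutes=keep_minutes)
    kept = []
    for line in lines:
        try:
            when = datetime.datetime.strptime(line[:16], TIME_FMT)
        except ValueError:
            kept.append(line)  # not a schedule line we understand; keep it
            continue
        if when >= cutoff:
            kept.append(line)
    return kept
```

The run branch could read the file into a list, call this, and write the result back out once it has finished tweeting.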

Ah well, onward and upward eh 😉


1. Ok, it’s a dire warning … It worked on the command line, honest !

UPDATE: If you decide to go live and test there, you then spend time hurriedly chasing down the bugs in your code which you just posted on your blog so as not to look like a complete berk … You have been warned !

UPDATE 2: Hmmm … Not _actually_ working, that’s a bit lousy … Ah, wait. I think I know what the problem is ! You need to specify the full path to the file in the config as cron doesn’t run in the same path ! Right take 3 !

UPDATE 3: Ok, that didn’t work … Back to the drawing board. Unfortunately I don’t have time now 😦 so I’ll have to come back in another post later … Arrrrggghhh !


How to Automate Twitter – a bit at least !

Perl (Photo credit: Wikipedia)

I’ve been trying to push up the readership of the blog here ( and get some people to stick around a bit, subscribe, follow on twitter etc. ). I’m not a Facebooker – I do have an account ( or two … ) but they contain nothing much of interest; they were created in order to investigate how FB worked, rather than anything else, so I’m not exactly the stereotypical user ! I make use of LinkedIn and Twitter as my online “social” tools and I’ve not graduated beyond that. The trouble is, I believe, in the transient nature of Twitter – I Tweet and it disappears off the bottom of the screen in seconds as others’ posts come in and push it down. I’ve watched for a while, and it seems that the “successful” Tweeters post their links frequently – keeping them in view for a longer period of time.

Now, I have to admit that I am lazy, but also geeky – I want to post a tweet advertising the blog frequently, but without user interaction. I’m sure that people will pop up and tell me of things that automagically do this for me – HootSuite springs to mind – but having used it, it has already upset me with its scheduling system – the CSV upload is a pain, and, as of yet, I’ve not managed a single one without an error. Sooo, as I spent a while some time ago messing around with Twitter and Perl, I thought that the easiest way forward might just be to write my own.

For want of a better methodology, as I intend to post once a week, I want each entry alerted on immediately, and then at increasing intervals until the next post is due out ( 1 week hence ). I don’t want mid-term posts to reset the last week’s worth, but if it is relevant ( like I hope all posts are !) then I do want to publicise it for a full week as well. I’ve tried a couple of exponential increases (2 * last period, 1.5 * last period), but to be honest, as I’m sure you can imagine, it gets up to over a day fairly quickly … (Google “exponential” if you want to know more !) So I’m going to say day one is once every two hours, day two is once every three hours, day three once every four hours, day four is four times in the day, day five is three times a day, day six is twice, and just the once on the seventh day – heck, if God can take a rest, so can our program ! That gives us a total of 36 Tweets ( 12 + 8 + 6 + 4 + 3 + 2 + 1 ), weighted towards the start whilst the post is fresh and tailing off as the new post comes along.
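As a sanity check on that total, the weekly plan can be written down as a list of tweets per day and turned into offsets from the time of the original post. This is just a sketch: the names DAILY_COUNTS and weekly_offsets are made up, and spacing the tweets evenly within days four to seven is my assumption, since only the number per day is given for those.

```python
import datetime

# Tweets per day for days one to seven: every 2h, every 3h, every 4h, then 4, 3, 2, 1
DAILY_COUNTS = [12, 8, 6, 4, 3, 2, 1]


def weekly_offsets(daily_counts=DAILY_COUNTS):
    """Return timedeltas from the original post, evenly spaced within each day."""
    offsets = []
    for day, count in enumerate(daily_counts):
        gap_hours = 24.0 / count
        for i in range(count):
            offsets.append(
                datetime.timedelta(hours=day * 24 + i * gap_hours))
    return offsets


if __name__ == "__main__":
    offsets = weekly_offsets()
    print(len(offsets))  # total number of tweets in the week
    print(offsets[-1])   # offset of the single tweet on day seven
```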

As always stated with my programming posts, I’m not a programmer, any similarity to programmers living or dead is entirely coincidental. I like programming in Perl, because, not only is “there more than one way to do it”, I can usually figure out at least one of those particular permutations – elegant as my solution may not be … [ if you want to see elegant programming – and the output of the man that I go to when I get stuck – have a look over here. Shamefully he wastes his time in the world of Microsoft, but we forgive him a lot 😉 ]

It turns out, much to my annoyance, that the authentication methods that I was using in “Hacking around with Twitter” are no longer valid. It seems that I now need to use OAuth [1] … However, after several hours of buggering around with it I failed completely to get it to work. So back to the drawing board there …

Python anybody ?

English: Python logo Deutsch: Python Logo (Photo credit: Wikipedia)

I’ve been meaning to get cracking with Python for some time. I was a die hard Perl fan until the day I saw the graphs that came from matplotlib – I was taken by the quality and professionalism of them, and I immediately spent far more money than can be considered sensible on all sorts of Python books so that I too, could make maths and art become one and the same thing. It seems though that I have the same level of programming ability as a garden slug when it comes to moving languages, and the same sort of speed of movement. It took me three years (ish) at university to learn C [ and ML and Prolog – but let’s be honest, neither of those actually count as programming languages ] and it’s taken me countless years since to learn to threaten, coerce and cajole Perl to do my bidding at least 50% of the time.

This, then, is my forced introduction to Python – my baptism of fire ( although God only knows why, if I can’t do it in Perl I stand the least bit of chance in Python ! ). And, not only that, I’m going to push it out here for your ridicule and derision.

I’d like to walk through the Rackspace cloud with you, but that’s for another day – let us just say that I quickly threw up a Fedora 15 (Lovelock) instance to play with, and was deeply relieved that Python appears to be a standard part of the distribution. For reference, my development environment also consists of Komodo Edit, which is excellent, with supported syntax highlighting for both Perl and Python ( and HTML and C and C++ and … ); when correctly configured, it is also quite happy using scp to remotely edit files and browse remote directories.

I understand that the Python equivalent of CPAN is PyPI – the Python Package Index – and, after installing the package, I’ve used that to install the Tweepy library. I’m not going to repeat the guidance on creating a new application in either (both!) of the blog links below – what I will say though is that you should remember to set your application settings to Read and Write – otherwise it won’t work 😉 [2]

I’ve split the examples out so that there is a config file that holds the various keys. Its format is as follows:

[consumer_keys]
CONSUMER_KEY = consumer_key_here
CONSUMER_SECRET = consumer_secret_here
[access_keys]
ACCESS_KEY = access_key_here
ACCESS_SECRET = access_secret_here

Obviously insert your own, hard earned keys in here – no inverted commas or anything, as they get parsed in a minute with ConfigParser. [ Basically, I couldn’t go through the rest of this worrying about accidentally publishing my keys every five minutes. ] I used the script provided in the example to do this, although it seems that you can generate these keys for your own Twitter account in the developer section of the site without going through the pain or the learning experience.
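To show what the parsing step gives you back, here is a short self-contained sketch. It uses the Python 3 module name, configparser, rather than the Python 2 ConfigParser used in this post, and it reads the sample from a string instead of twitter.conf so that it can be run anywhere. The SAMPLE text and the load_keys helper are there purely for illustration.

```python
import configparser

SAMPLE = """\
[consumer_keys]
CONSUMER_KEY = consumer_key_here
CONSUMER_SECRET = consumer_secret_here
[access_keys]
ACCESS_KEY = access_key_here
ACCESS_SECRET = access_secret_here
"""


def load_keys(text):
    """Parse the config text and return the four keys as a dictionary."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "consumer_key": parser.get("consumer_keys", "CONSUMER_KEY"),
        "consumer_secret": parser.get("consumer_keys", "CONSUMER_SECRET"),
        "access_key": parser.get("access_keys", "ACCESS_KEY"),
        "access_secret": parser.get("access_keys", "ACCESS_SECRET"),
    }


if __name__ == "__main__":
    print(load_keys(SAMPLE))
```

The values come back as plain strings with the surrounding whitespace stripped, which is why no inverted commas are needed in the file.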

I’m getting worried how long this post is getting – especially after a discussion with a young man the other day who said that his dissertation was 5000 words only and I’ve written a fifth of that ! – so below is the remainder of the sample code for a command line client. This takes text after the command ( contained in ' ' ) and updates your status with it ( e.g. ./twitter.py 'It lives!' ):

#!/usr/bin/env python

import sys
import tweepy
from ConfigParser import SafeConfigParser

parser = SafeConfigParser()
parser.read('twitter.conf')

CONSUMER_KEY = parser.get('consumer_keys','CONSUMER_KEY')
CONSUMER_SECRET = parser.get('consumer_keys','CONSUMER_SECRET')
ACCESS_KEY = parser.get('access_keys','ACCESS_KEY')
ACCESS_SECRET = parser.get('access_keys','ACCESS_SECRET')

auth = tweepy.OAuthHandler(CONSUMER_KEY, CONSUMER_SECRET)
auth.set_access_token(ACCESS_KEY, ACCESS_SECRET)
api = tweepy.API(auth)
api.update_status(sys.argv[1])

I’ll write a second post within the next week to update the remainder with a full program to automate the remainder of the posting process – I want to get it running asap to be honest, as I think I’m missing out !


1. With thanks to David Moreno’s blog post on the issue as my starting point on OAuth for Perl, and perhaps the first and last bit of it that I understood ! And Jeff Miller’s blog post for the Python equivalent.
2. Which may well be why I couldn’t get the darn Perl version to work, I realise now. However, a kick in the pants, is a kick in the pants for whatever reason it comes …
