-
02-02-05 #1
data muncher
- Join Date
- Sep 2004
- Location
- Berlin
- Posts
- 2,477
- Thanks
- 0
- Thanked 0 Times in 0 Posts
affiliate feed middleware
OK, after much thought I am going to start making a public version of the import scripts and analysers that I have already built for one of my sites. Unfortunately there will have to be some kind of income from this, whether by donation or by signup/subscription fees, purely because of the amount of traffic it would use if every affiliate and his dog started using it. My first question is whether this is even viable: would anyone actually use the damn thing? Secondly, have I missed anything? The site will work like this.
First of all I will beta test it using Affiliate Window, I guess. The affiliate will enter the merchant ID plus his username and password for the feed, and our site will then connect to the Affiliate Window server. As the affiliate himself is connecting to the server through our site, I can't see any problems with it. The server will then download the feed, although the user will have the option to upload the feed via FTP if he wishes.
The site will then extract the data, and start the process list:
1: Determine whether the EAN is reliable for each row by comparing EAN results to the product name, and allocate the feed an EAN reliability factor as a percentage.
2: Look at the product name and check the remainder of the feed for duplicate entries. If a name does appear twice, confirm that the product description, EAN etc. are also the same, so that it is a genuine duplicate, and delete the extra entries.
3: Cross-reference the product name with the Google AdWords, Overture and Yahoo keyword suggestion tools to give affiliates a list of keyword alternatives to bid on if they want to try PPC campaigns or SEO work.
4: Check for any pricing symbols inside the product name or description; delete the product entry if there are any.
5: Check for HTML in the product name or description; if there is any, delete the entry.
6: Check that the image exists, collect it from the merchant's site, resize it to a predetermined size and further reprocess it to make a thumbnail, to guarantee all images are the same size. Save the images on the server and rewrite all the image file paths to point to our server. If there is no image, the user will have the option to insert his own image for all the missing ones, or to delete any rows that have no image.
7: Redirect and 404 check: the script will visit each link and monitor for redirects and 404 pages. If either occurs, it will delete that row of data from the feed.
8: While visiting the merchant's page it will look for the price in the HTML; if the same price as in the feed is not found, it will drop the row of data. This might be something of an issue: if the page contains prices of other products and our script finds £9.99 anywhere on the page, it will treat the price as valid. I have also seen one retailer that uses images for prices. We have used this check and it works well; we do lose some listings, but we find the ones we keep are more reliable.
9: Whilst at the merchant's site it will also check the site against W3C standards, not so much for any real practical result, but because well-designed sites without too many technical errors give affiliates a general idea of the site's quality. It will therefore issue a percentage rating for the site as well. Useless information for most, but it can be used at a glance.
10: Compare the merchant's site with the Alexa traffic guide to give a really rough idea of what the merchant's traffic is like. There is no point marketing a site that has zero traffic anyway; it's an indication that the site isn't that great to begin with, although people can just use it as a guide.
11: Add an optional field to the end of the feed giving the PageRank for each deep link into the site.
12: Check for descriptions. If no description exists, the user will have the option to keep the product as it is, delete the ones without descriptions, or add their own "I am sorry, there is currently no description for this item" tag or something.
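A minimal sketch of how steps 2, 4, 5 and 12 could hang together, not the actual import scripts. The field names (`name`, `description`, `ean`), the set of currency symbols and the choice to drop rows without a description (only one of the options in step 12) are all assumptions made for illustration.

```python
import re

# Assumed set of "pricing symbols"; a real feed may need more.
PRICE_SYMBOLS = re.compile(r"[£$€]")
# Very rough HTML detector: anything that looks like a tag.
HTML_TAG = re.compile(r"<[^>]+>")


def clean_feed(rows):
    """Split feed rows into (kept, dropped).

    Each row is a dict; each dropped entry is a (row, reason) pair.
    """
    kept, dropped = [], []
    seen = set()
    for row in rows:
        name = row.get("name", "").strip()
        desc = row.get("description", "").strip()
        text = name + " " + desc
        # Step 2: a duplicate must match on name, description and EAN.
        key = (name.lower(), desc.lower(), row.get("ean", ""))
        if key in seen:
            dropped.append((row, "duplicate entry"))
        elif PRICE_SYMBOLS.search(text):  # step 4
            dropped.append((row, "price symbol in name or description"))
        elif HTML_TAG.search(text):  # step 5
            dropped.append((row, "HTML in name or description"))
        elif not desc:  # step 12, "delete" option
            dropped.append((row, "missing description"))
        else:
            seen.add(key)
            kept.append(row)
    return kept, dropped
```

Keeping the reason next to each dropped row means the same pass can feed both the cleaned file and a report of what was thrown away.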
This is the checklist of things the process must do. To clean one feed of 50,000 products we expect the server to take an hour. The affiliate will then get an email telling him the feed is ready to download from the server, along with a second CSV listing all the products/rows that were dropped and the reasons why, which can be emailed to the merchant if the affiliate wants them to put some more effort into the feed.
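The second CSV could be produced along these lines. This is only a sketch: the `(row, reason)` pair shape, the function name and the `reason` column name are assumptions, not part of the existing scripts.

```python
import csv
import io


def dropped_rows_csv(dropped, fieldnames):
    """Render (row, reason) pairs as CSV text with an extra 'reason' column."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=list(fieldnames) + ["reason"],
        extrasaction="ignore",  # skip feed columns we were not asked to export
        lineterminator="\n",
    )
    writer.writeheader()
    for row, reason in dropped:
        writer.writerow({**row, "reason": reason})
    return buffer.getvalue()


print(dropped_rows_csv([({"name": "Iron", "ean": "123"}, "missing description")],
                       ["name", "ean"]))
```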
As well as this, other statistical information will be emailed to the affiliate, such as feed quality as a percentage. If we took a feed of 100,000 products and 50,000 of them were rubbish, the feed would have a quality of 50%, which is very poor, but I think this score would be a fair reflection of the merchant and its affiliate offering. Obviously no merchant wants to be labelled as poor. Once the beta test has been built for the one network, we could then use it as a pattern for other networks.
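The quality score described here is just the share of rows that survive cleaning. A small sketch, with a hypothetical function name:

```python
def feed_quality(total_rows, dropped_rows):
    """Percentage of rows that survived cleaning."""
    if total_rows <= 0:
        raise ValueError("feed has no rows")
    return 100.0 * (total_rows - dropped_rows) / total_rows


print(feed_quality(100_000, 50_000))  # 50.0
```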
The problem I see with all of this is the sheer amount of traffic something like this would generate. I am not interested in the money from a profit point of view; it is more a demonstration to networks and affiliates of how good our services are, so that we get work off the back of it. But I would like to know what else could be incorporated into this third-party service while it checks the feeds, and also whether people would actually use it once it was made.
I appreciate the networks will follow suit with some of the criteria I have mentioned above, but there will always be issues from an affiliate's perspective that we can check for and that the networks won't want to spend money on. So at least the service can always stay ahead of the game, and I think it will push merchants and the networks harder to sort out the feeds, since once reports start floating around the net they are going to want to be quick to rectify the bad publicity.
Your thoughts, ladies and gentlemen, please ;-)
Nothing to see here...
-
02-02-05 #2
data muncher
Forgot the last bit: the user interface would allow the affiliate to store his or her import profile for each feed and, if they want, set cron jobs so it imports at whatever frequency they like, daily or weekly. If they specify an FTP directory they could also have the file delivered to them directly overnight, so that their own automatic scripts could just use it on their servers. For that matter, smaller files could be sent by email.
A few extra export options could also be offered, such as converting the file to XML, RSS or any other format.
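For the XML export, something like the following would do as a starting point. It is a sketch using Python's standard library; the `products`/`product` element names are made up, and it assumes every feed column name is already a valid XML tag name.

```python
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Convert a list of feed rows (dicts) into an XML string."""
    root = ET.Element("products")
    for row in rows:
        product = ET.SubElement(root, "product")
        for field, value in row.items():
            # One child element per feed column; text is escaped automatically.
            ET.SubElement(product, field).text = str(value)
    return ET.tostring(root, encoding="unicode")


print(rows_to_xml([{"name": "Kettle", "price": "9.99"}]))
```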
-
02-02-05 #3
Registered User
- Join Date
- Aug 2003
- Location
- Cheshire
- Posts
- 263
- Thanks
- 0
- Thanked 1 Time in 1 Post
If you're planning on retrieving images and checking for 404s etc., then 50,000 products in 1 hour is WAY too optimistic. That's more than 10 requests a second.
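As a rough check on that figure, assuming one request per product for the link check and a second one for the image (the per-product counts are my assumption):

```python
def requests_per_second(products, requests_per_product, hours):
    """Average request rate needed to finish the whole feed in the given time."""
    return products * requests_per_product / (hours * 3600)


print(round(requests_per_second(50_000, 1, 1), 1))  # 13.9 (link check only)
print(round(requests_per_second(50_000, 2, 1), 1))  # 27.8 (link check plus image)
```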
If I was a merchant and you did this to my site I'd have your IP address blocked.
You need to put a pause in between each request to the merchant's site, otherwise you're asking for trouble.
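A sketch of what a paused link check might look like. `fetch_status` is a hypothetical callable that returns the HTTP status code for a URL without following redirects, and the one-second default delay is an arbitrary placeholder rather than a recommendation.

```python
import time


def check_links(urls, fetch_status, delay_seconds=1.0, sleep=time.sleep):
    """Check URLs one at a time, pausing between requests.

    Returns a dict mapping each URL to "keep", "drop: redirect" or "drop: 404".
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            sleep(delay_seconds)  # be polite: never hit the site back to back
        status = fetch_status(url)
        if status == 404:
            results[url] = "drop: 404"
        elif 300 <= status < 400:
            results[url] = "drop: redirect"
        else:
            # Other errors (e.g. 500) are kept here; only redirects and 404s
            # were named as reasons to drop a row.
            results[url] = "keep"
    return results
```

Passing `sleep` in as a parameter makes the pause easy to stub out when testing.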
Interesting idea though!
Jon
-
02-02-05 #4
data muncher
Yes, I am sure it's more than that; I don't see any reason why we can't run up to 100 threads at a time. I can see your point as a merchant, but once the agent was identified as an affiliate tool I can't see why the merchant would want to ban the IP address. It will always be static and can be identified as an affiliate server tool.
Separately, if the same file is requested for processing again for the same merchant then the script wouldn't have to process your site again, so your maximum is once per day, which isn't really any greater than a search engine.
I am sure there will be people that ban it at first, but I can't see why a merchant would tell an affiliate that he won't allow them to use tools that make their job easier.
This is only my thought; I really do need you to tell me if you would still object to it on that basis. If so, then I need to reconsider it and work on a way to get over it. Either way, a proxy service with multiple IP addresses would get over any problems, as it would be too hard for a merchant to identify every single IP address or range.
-
02-02-05 #5
Registered User
Originally posted by pricethat:
"Yes, I am sure it's more than that; I don't see any reason why we can't run up to 100 threads at a time. I can see your point as a merchant, but once the agent was identified as an affiliate tool I can't see why the merchant would want to ban the IP address. It will always be static and can be identified as an affiliate server tool."
I can think of several reasons, e.g. bandwidth costs money, and you're going to slow their system down to the detriment of online users buying stuff. Remember as well that not all merchants maintain their own servers, and it may be left to a hosting company to block IPs.
Originally posted by pricethat:
"Separately, if the same file is requested for processing again for the same merchant then the script wouldn't have to process your site again, so your maximum is once per day, which isn't really any greater than a search engine."
Most search engines (at least on our sites) leave at least 15 to 20 seconds between requests, which is a much lower request rate than you are proposing.
Originally posted by pricethat:
"This is only my thought; I really do need you to tell me if you would still object to it on that basis. If so, then I need to reconsider it and work on a way to get over it. Either way, a proxy service with multiple IP addresses would get over any problems, as it would be too hard for a merchant to identify every single IP address or range."
It's going to look like a denial of service attack.
Jon
-
02-02-05 #6
data muncher
always wanted to do one of those
A compromise: run the scripts at a keen rate overnight for all the cron job customers. I would have to look at it, but if a site isn't capable of handling 10 requests per second then that's got to be an indication of the site's quality too, eh?
Information that the affiliate would want? Of course your point is a very, very valid one, but when you're running something like a data feed deep link affiliate scheme, you would have thought merchants would want to make sure that their site can handle it. If a tool like this is needed to verify data as part of that scheme, then maybe they should accommodate it?
Are there any merchants out there that wouldn't put up with 10 requests per second for an hour per day? Bear in mind that would be for sites with around 50,000 products; if they are that big and can't handle that volume of queries with ease, then they need to rethink anyway, eh?
Small sites with 200 to 300 products could have slower queries. The important thing is that if an affiliate is going to use this, they are going to want the results as fast as possible.
I presume such a scheme is needed, given the number of posts about feeds that you see on here all the time.
-
02-02-05 #7
data muncher
I think this issue can be addressed with server response time: measure the initial response speed, then increase the number of threads every 10 seconds or so until the response time changes by more than a set tolerance. This would balance the load to suit every server.
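Roughly what I have in mind, as a sketch only. `measure_response_time` is a hypothetical probe that returns the average response time in seconds at a given thread count (it would also be the place to wait the 10 seconds between steps), and the 1.5x tolerance is a made-up parameter.

```python
def ramp_up_threads(measure_response_time, max_threads=100, step=1, tolerance=1.5):
    """Raise the thread count until responses slow down past the tolerance.

    Returns the highest thread count whose response time stayed within
    tolerance * the single-thread baseline.
    """
    baseline = measure_response_time(1)
    threads = 1
    while threads + step <= max_threads:
        candidate = threads + step
        if measure_response_time(candidate) > baseline * tolerance:
            break  # the server is starting to struggle; stay at the last good level
        threads = candidate
    return threads
```

A small site would settle on a handful of threads, while a server that never slows down would be capped at `max_threads`.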
First we would have to establish that the merchant is happy with this. If the networks made people aware that such programs may exist, and perhaps there was an opt-in email from the merchant to allow it, would that address the issue as far as you are concerned?
I know there is the issue of bandwidth to consider, but at the same time the sales increase from more targeted, effective affiliate schemes would outweigh this. The amount of bandwidth we would use would only be as much as a search engine, and no one seems to object to them?
-
02-02-05 #8
Registered User
- Join Date
- Oct 2003
- Posts
- 97
- Thanks
- 0
- Thanked 0 Times in 0 Posts
Why not write it as a desktop tool and then let people download it? That way you don't have issues with your own server, and merchants wouldn't be able to block all the IPs so easily.
Of course, Affiliate Window should be doing this at source and sorting out the oh so many merchants who have such crap feeds.
-
02-02-05 #9
data muncher
70% of the scripts have already been made, because I had to make them for my own site, so it would be much more work to make it a Windows application. But there are other reasons.
Firstly, if it's standalone then you might get 50 different affiliates all processing the same merchant's feed, which would kill the merchant's site. Once a feed has been processed once, it doesn't need to be processed again however many people use it.
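The "process once, serve many" idea could look something like this. It is a sketch: the class name, the in-memory dict and the one-result-per-merchant-per-day key are all assumptions; a real service would presumably store results on disk or in a database.

```python
import datetime


class FeedCache:
    """Run the expensive clean-up once per merchant per day and reuse the result."""

    def __init__(self, process_feed):
        self._process_feed = process_feed  # hypothetical: merchant_id -> cleaned feed
        self._results = {}

    def get(self, merchant_id, today=None):
        today = today or datetime.date.today()
        key = (merchant_id, today)
        if key not in self._results:
            # Only the first request of the day touches the merchant's site.
            self._results[key] = self._process_feed(merchant_id)
        return self._results[key]
```

With this in front of the crawler, 50 affiliates asking for the same feed on the same day cause one crawl of the merchant's site rather than 50.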
Secondly, images can be resized properly, thumbnailed perfectly and stored on the server, with the data feed changed to link to our server for the images, which of course will be a lot more consistent than the merchants' own.
Thirdly, the server's internet connection is faster than the standard ADSL that people have.
Fourth, and probably the most important of them all: the server we use is 2.8 GHz with 2 GB of RAM, and this would probably require 80 GB of total storage; maybe we would even need two servers doing the same. I don't know of anyone that has those kinds of resources on a desktop. Anything less and it will just die, and it would probably crash Windows at the speed it needs to run to be "useful".
Any suggestions for how it could be made using Windows would be useful; maybe a bit of server side and a bit of client side would work well.
-
02-02-05 #10
Dark Prince
- Join Date
- Aug 2003
- Location
- Behind you
- Posts
- 1,688
- Thanks
- 4
- Thanked 16 Times in 14 Posts
Originally posted by pricethat:
"Fourth and probably the most important of them all, the server we use is 2.8GHZ with 2GB ram, this would probably require 80GB of total storage, maybe we would even need two servers doing the same. I dont know of anyone that has that kind of resources"
I agree with all the other points but my desktop would eat that for breakfast.
It sounds like an interesting service that you are proposing. I'd be interested to see what prices you proposed, but I think in the long run I'd want a script like that on my own server.
-
02-02-05 #11
Banned
- Join Date
- Nov 2003
- Location
- Bucharest, Romania
- Posts
- 2,684
- Thanks
- 0
- Thanked 0 Times in 0 Posts
The trouble is, I doubt the networks realise that if they put a healthy guide together on how to use their services, they'd have more take up. Someone in each network clearly doesn't have the brain cell to figure that out though....
Mind you, it seems to have taken several years for TradeDoubler to put a decent FAQ together... so don't expect anything soon. Hopefully someone at TD or elsewhere will help my braincells figure it out some time soon.
Last edited by Lee_Owen; 02-02-05 at 09:42 PM.
-
02-02-05 #12
data muncher
I'm not sure what a guide would do in this case, Lee. The problem is for those who do know how to use the feeds but are presented with a new set of problems once they start using them, none of which a user guide would solve.
These are technical problems that trace back to merchants not preparing proper feeds, free of errors and deliberate exploits. This thread is hopefully a step towards helping us overcome this without wasting another year shouting "wouldn't it be great if".
Regards
-
02-02-05 #13
Banned
Fair enough, I did read it all through, it's just I know if I go out and buy a book I'll be sat reading it going huh, er wha? I'm more hands on with trial and error with a little guidance than reading hundreds of pages to get to step one.
I think it's admirable that someone wants to do what networks should be doing actively and when it works then people won't mind funding it.
However, I just think there must be simpler ways of getting from A to B. For instance, I'm setting one site up now, and at the bottom of each product page there will hopefully be a comparison section. I'm of the age of plug and play; that's all I've known. Indeed, Frontpage is the furthest I've got design-wise, with admin panels now being fitted, and when payments finally arrive I'll get people on the case. I have so many domains with tenth content, can't wait until they're all up and running.
Plug and play is what networks should be looking into. Surely it isn't that difficult to find a solution. I knew I was out of my depth replying to this post, but that was the point, I think: there's still Bill and Bob and little Joe who haven't got past step one using current offerings.
-
02-02-05 #14
data muncher
I suppose your thoughts are valid, although slightly off topic, so let me bring it round to the post again. If I take a step back, it was the same thing that made me want to use data feeds. When I got into plugging my data feeds in (albeit as an import) I found that they were awful, so I started to make tools to make them nice. I hadn't thought of it from your perspective, but I imagine if the feeds were nice it would make it a lot easier for new starters to get used to them, and even if they were no easier, at least you wouldn't have to go through some of the problems we have.
Though it has to be said, there is always a problem, all of the time. Our problems will just advance as we get bored of or used to feeds and want to progress past that point.
-
02-02-05 #15
Registered User
- Join Date
- Jul 2004
- Location
- everywhere
- Posts
- 64
- Thanks
- 0
- Thanked 0 Times in 0 Posts
For each product, generate a new meaningful, relevant and unique product description for each different affiliate taking it, and I'll pay you something....
Forget the rest!