Affiliate Marketing
Forum Search

Reply
 
LinkBack Thread Tools Display Modes

  #1 (permalink)  
Old 19-10-06
Registered User
 
Join Date: Feb 2006
Location: Gillingham
Posts: 503
Thanks: 0
Thanked 0 Times in 0 Posts
Donk is an unknown quantity at this point
  Relational or Flatfile Database?

I'm creating (yet another) product feed database script and I'm looking for ideas help from the forum.

I've searched through various websites for the answer but can't find one anywhere.

Is it better to use a flatfile or a relational database?

Most of the articles I've read refer to various gurus of the 70's and the benefits of the relational database. But is this still relevant to today? I remember 'back in the day' running an employees database on an 8086 machine running at 8mhz and stored on a 360k floppy disc. Things have changed considerably since then, computer speeds have in increased by a factor of 500 and storage by a factor of hundreds of thousands. But the old '70s theories still prevail are they still relevant?

The main table I have designed is shown in the following mysql dump.
Code:
CREATE TABLE `products` (
  `Merchant_ID` int(7) NOT NULL default '0',
  `Product_ID` varchar(30) NOT NULL default '0',
  `Name` varchar(100) NOT NULL default '',
  `Brand` varchar(30) default NULL,
  `Short_Description` varchar(255) NOT NULL default '',
  `Long_Description` text,
  `Product_URL` varchar(255) NOT NULL default '',
  `Image_URL` varchar(255) NOT NULL default '',
  `Thumbnail_URL` varchar(255) NOT NULL default '',
  `RRP` decimal(8,2) NOT NULL default '0.00',
  `Offer_Price` decimal(8,2) NOT NULL default '0.00',
  `Promo_Text` text,
  `Delivery` varchar(255) default NULL,
  `Category_ID` int(6) NOT NULL default '0',
  `Delete_Flag` tinyint(1) NOT NULL default '0',
  `currency` char(3) NOT NULL default 'GBP',
  PRIMARY KEY  (`Product_ID`),
  KEY `Name` (`Name`),
  KEY `Category_ID` (`Category_ID`),
  KEY `Merchant_ID` (`Merchant_ID`)
) ENGINE=MyISAM ;
(I also have separate category tables for various networks linked on the Category_ID and a merchant table).

If I wish to add industry specific data (eg holidays cellphone cds etc) would I
be better off adding the relevant fields to the main table or would it be better to have a separate table for each industry link on the Product_ID?

Any advice / help would be appreciated.

Regards
__________________
They came for my 404 and I said nothing
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Sponsored Links
  #2 (permalink)  
Old 19-10-06
pricethat's Avatar
data muncher
 
Join Date: Sep 2004
Location: Berlin
Posts: 2,485
Thanks: 0
Thanked 0 Times in 0 Posts
pricethat is an unknown quantity at this point
You opened up a whole can of worms, it is good to really put some thought into a database structure, i think you relational database thing is neither here nor there nowadays. So just carry on using mysql the way that everyone else does but onto your table structure.

There is no right or wrong way, you really do have to make your own mind up depending on what it is you are trying to do and if you are trying to have lots of things then one setup will never be as good as another,

you say one table over several tables, how do you seperate products into different tables when one merchant sells thousands of items across 6 different product types? how can you seperate them when the merchant doesnt have a category structure to identify them.

You want one table with lots of data, how much data, we run mysql with millions of products in one table but we did a lot of custom tuning with both the server and mysql to get it to work/behave. What about indexes?

For some jobs it would be better to have each field as a seperate table but then you have more strain on your machine doing searches in one table to get id's and then going to other tables to get the information (double the queries) so you see the list is endless.

There are some clever ways that you can have one table for ALL types of data, it does not matter what product groups there are but that is probably not the kind of programming you would get a lot of support with on forums and programmers as they probably would not understand what you are doing and why.

Everything depends on everything, processor, hard drive speed, memory buffers, caches, tuning so you have to know what it is sat on and how much it will be used and if your designing a script for others to use then you need to presume that they will all be on shared hosting all with timeout limits on processes and with very little memory so it makes the job even harder, you cant have large tables because there is not enough resource and you cant have many tables because mysql will have too many tables open in memory.

I am not going to get into database consultation as its costs us many thousands of pounds to get where we are with ours and i would say its only now we really really understand how to push mysql hard like a ***** but if you are starting off i would say just put it all in one table and make some money, accept that the fact that it only becomes a problem when you are making money and then you have enough cash to fix the problem as they come about.

Also bear in mind that generally performance problems associated with large tables, providing the code is ok can generally be solved by just getting a bigger server or faster drives most of the time.
__________________
Nothing to see here...
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #3 (permalink)  
Old 19-10-06
Spartacus's Avatar
[yet-to-be] freed slave
 
Join Date: Sep 2006
Location: Sheffield
Posts: 209
Thanks: 0
Thanked 1 Time in 1 Post
Spartacus is an unknown quantity at this point
Use a relational database, such as MySQL. They're still as valid today as way back when. With reference to the performance issues mentioned above, If you're not looking for an industrial strength solution then number of tables may not be an issue.

If you performance is an issue then you would need consider denormalisation of your data, search indexes and other stuff that escapes me at the moment. Denormalisation creates redundancy and the potential for inconsistencies, so that’s where the design comes in.

Flat files are OK for noddy applications but when it comes to searching and sorting this can be a pain. Much easier to write a bit of MySQL. I think you can use SQL to query files (vague memory), but it certainly isn’t widely used.

There are also issues with security when using files, and you need to consider locking and have the same redundancy issues as denormalised tables.

In my view it’s better to dive in build yourself a logical data model, normalise if you can and use it as the basis of your table design. Once you have tried to right a few SQL statements with tricky joins in you’ll soon realise how far to go with normailisation.

Alternatively keep it simple and table numbers to a minimum.
__________________
http://www.tenuouslinks.co.uk (No SEO required)
http://www.rentagaff.co.uk (Search2Let)

Last edited by Spartacus; 19-10-06 at 02:32 PM..
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #4 (permalink)  
Old 19-10-06
Spartacus's Avatar
[yet-to-be] freed slave
 
Join Date: Sep 2006
Location: Sheffield
Posts: 209
Thanks: 0
Thanked 1 Time in 1 Post
Spartacus is an unknown quantity at this point
Quote:
Originally Posted by Donk View Post
If I wish to add industry specific data (eg holidays cellphone cds etc) would I
be better off adding the relevant fields to the main table or would it be better to have a separate table for each industry link on the Product_ID?
If it's a one to one relationship with the rest of the data on the table add it to the main table.
__________________
http://www.tenuouslinks.co.uk (No SEO required)
http://www.rentagaff.co.uk (Search2Let)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Sponsored Links
  #5 (permalink)  
Old 19-10-06
Registered User
 
Join Date: Feb 2006
Location: Gillingham
Posts: 503
Thanks: 0
Thanked 0 Times in 0 Posts
Donk is an unknown quantity at this point
Thanks for your detailed and timely reply, there is a lot of information to take in.

As you say to really produce a large site requires a dedicated server, also to manage a feed with millions of products requires more resources than just a 'one man band'.

I am well aware too of the problems of shared hosting. I have been using two hosts. Up until yesterday one was cheap (used for many small sites) and the other was efficient (used for one larger site). Due to the vagaries of shared hosting and the hosts trying to 'optimise their ROI' today It would appear that the only way I can diferentiate the hosts now is that one is cheap and the other isn't.

With my present limited resources I think will have to stick with the smaller sites for the time being but at least have the database and coding in place which will allow me to expand without having to start again.

Spartacus my question wasn't teally to do storing the files as csv/text but how they are used in MySql. It would appear from scripts available on the web that the majority of users do not use the SQL capabilities of the MySql engine but rely on the coding in PHP to create the relationships.

The use of one table for all types of products I think would create a fairly cumbersome table. I was looking at a mobile phone feed which had about 20 extra fields that are specific to the phone industry, also a holiday feed witha about 10 extra. For CDs you would need to add artist, genre, tracks etc. then add to that mortgages, loans, betting etc the list goes on. My main table would grow from the present 20 (approx) fields to a hudred or more many of which would require indexing.

I need to consider all the different factors before I commit to much time into what has become a major project.

Thanks for your input
__________________
They came for my 404 and I said nothing
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #6 (permalink)  
Old 19-10-06
pricethat's Avatar
data muncher
 
Join Date: Sep 2004
Location: Berlin
Posts: 2,485
Thanks: 0
Thanked 0 Times in 0 Posts
pricethat is an unknown quantity at this point
Quote:
Originally Posted by Donk View Post
My main table would grow from the present 20 (approx) fields to a hudred or more many of which would require indexing.
Thats what i was talking about when being clever about things, you dont need hundreds of fields just for every different eventuality and feature type, some things remain a constant throughout the datas, colour for example, you dont need to add a field just for colour just because it is important for one product type, make one field called attributes

Colour:Value,height:value,width:value,depth:value, weight:value, etc etc etc

You dont need to have all of that in each field but you can get all the information into one field and then just deal with one field looking for strings within it. You talk about keeping small but i seem to be able to manage a few million queries a day in this manner.

Also that allows you to run searches for "colour:red" inside that field quite easily when you want to find red items etc. just the same as if you had a dedicated field.

Ok i am not suggesting that this is one of the perfect ways to do it but it is a good example of thinking more rather than thinking that 1 field should just have 1 bit of info that is normalised in it.
__________________
Nothing to see here...
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #7 (permalink)  
Old 19-10-06
Super Moderator
 
Join Date: Aug 2003
Posts: 2,451
Thanks: 0
Thanked 0 Times in 0 Posts
Rich is an unknown quantity at this point
I'd definately go with MySQL over flat file just because it is so much more flexible and forgiving (though just throwing faster servers at a problem is rarely the best move!). Flat file formats can perform much faster than MySQL but you really need to know what route you will be taking before starting constructing them.

There was a series on Oreilly back in May (thanks Google desktop for finding it!) that discussed 'Database war stories' at companies like Bloglines, Flickr, Findory. I haven't reread them now but I think they were quite good. This links to the one about Findory, that uses a combination of the two, and that links to the other articles in the series.

I pretty much always use MySQL, it's just so much easier - though I have been pulling my hair out for the last couple of days on one select which is now 360 lines long.
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
Sponsored Links
  #8 (permalink)  
Old 19-10-06
Spartacus's Avatar
[yet-to-be] freed slave
 
Join Date: Sep 2006
Location: Sheffield
Posts: 209
Thanks: 0
Thanked 1 Time in 1 Post
Spartacus is an unknown quantity at this point
Quote:
Originally Posted by Donk View Post
The use of one table for all types of products I think would create a fairly cumbersome table. I was looking at a mobile phone feed which had about 20 extra fields that are specific to the phone industry, also a holiday feed witha about 10 extra. For CDs you would need to add artist, genre, tracks etc. then add to that mortgages, loans, betting etc the list goes on. My main table would grow from the present 20 (approx) fields to a hudred or more many of which would require indexing.

I think I understand you better now and agree with your approach. Just FYI I tend to use an object-oriented aproach. So I would have say a CD object which is stored on a CDs table, such that each attribute of the object represents a column on the table. I suppose if I had a cassette object that had mostly the same fields I would consider using a generalised audio odject and use a single table to store both types of data. You know you've gone wrong when there are null values everywhere.

So when you refer to flat files do you mean a single record/row on the table?

Good luck anyway
__________________
http://www.tenuouslinks.co.uk (No SEO required)
http://www.rentagaff.co.uk (Search2Let)

Last edited by Spartacus; 19-10-06 at 04:45 PM.. Reason: Reread original post
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #9 (permalink)  
Old 19-10-06
Spartacus's Avatar
[yet-to-be] freed slave
 
Join Date: Sep 2006
Location: Sheffield
Posts: 209
Thanks: 0
Thanked 1 Time in 1 Post
Spartacus is an unknown quantity at this point
Quote:
Originally Posted by Donk View Post
It would appear from scripts available on the web that the majority of users do not use the SQL capabilities of the MySql engine but rely on the coding in PHP to create the relationships.
Just one more thing ...

I agree with this to. Try to go this way myself as it suits my OO approach. Sometimes however you just can't get round it. Suppose you had supplier records on another table an the supplier went bust , so you set the supplier to suspended. It could be useful then to join the products table to the supplier table in order to determine which products to no longer display at the time of the query.
__________________
http://www.tenuouslinks.co.uk (No SEO required)
http://www.rentagaff.co.uk (Search2Let)
Digg this Post!Add Post to del.icio.usBookmark Post in TechnoratiFurl this Post!
Reply With Quote
  #10 (permalink)  
Old 19-10-06
Registered User
 
Join Date: Feb 2006
Location: Gillingham
Posts: 503
Thanks: 0
Thanked 0 Times in 0 Posts
Donk is an unknown quantity at this point
I didn't think I'd get so many replies to this question thanks all for the input.

Quote:
Originally Posted by Rich View Post
I pretty much always use MySQL, it's just so much easier - though I have been pulling my hair out for the last couple of days on one select which is now 360 lines long.
Just a suggestion but I've just a completed a small site (20k records max) for a customer and dragged in the whole table in one go.
PHP Code:
$productarray[]="";
 
$result mysql_query($sql) or die("Query failed&nb