Friday, March 30, 2012

Is this too much data for SQL?

I need to log data collected from instruments in a laboratory at a
very high rate. I need to log timestamps and values measured from
various devices at a rate as high as once a second.
This means my table will grow to tens or hundreds of gigabytes within
a year. I want to know if SQL Server will be able to handle this much
data. In addition, is it better to separate this data into different
tables? Does a table become hard to search (search by the timestamp
field I am logging) once it gets to a certain size?
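(For scale: one value per second from a device is 86,400 rows a day, or
roughly 31.5 million rows a year per device; even at a few dozen bytes per
row that works out to a gigabyte or two per device per year, so with many
devices the table quickly reaches the sizes above.)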

Does anyone have experience or advice about this problem? Is there
possibly another product out there which is better suited for logging
time-based data like this than SQL Server?

Thanks"NewbieNewsGrouper" <newbienewsgrouper@.hotmail.com> wrote in message
news:1e253672.0406161118.692433da@.posting.google.c om...
> I need to log data collected from instruments in a laboratory at a
> very high rate. I need to log timestamps and values measured from
> various devices at a rate as high as once a second.
> This means my table will grow to tens or hundreds of gigabytes within
> a year. I want to know if SQL server will be able to handle this much
> data. In addition, is it better to separate this data into different
> tables? Does a table become hard to search (search by the timestamp
> field I am logging) once it gets to a certain size?
> Does anyone have experience or advice about this problem? Is there
> possibly another product out there which is better suited for logging
> time-based data like this than SQL Server?
> Thanks

I don't have any personal experience of applications like the one you're
describing, but there are certainly SQL Server databases which are handling
those volumes of data and transactions:

http://www.microsoft.com/sql/techin...scalability.asp
http://www.tpc.org/tpcc/results/tpc...erf_results.asp

It is possible to partition data using partitioned views - you can find more
details in Books Online.
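
For what it's worth, here is a minimal sketch of what such a partitioned view
could look like, assuming one member table per month (the table and column
names are made up for illustration):

-- One member table per month; the CHECK constraint on the timestamp is what
-- lets the optimizer skip member tables that cannot contain the rows a
-- query asks for.
CREATE TABLE Readings_200406 (
    LogTime  datetime NOT NULL
        CHECK (LogTime >= '20040601' AND LogTime < '20040701'),
    DeviceId int      NOT NULL,
    Value    float    NOT NULL,
    CONSTRAINT PK_Readings_200406 PRIMARY KEY (LogTime, DeviceId)
)

CREATE TABLE Readings_200407 (
    LogTime  datetime NOT NULL
        CHECK (LogTime >= '20040701' AND LogTime < '20040801'),
    DeviceId int      NOT NULL,
    Value    float    NOT NULL,
    CONSTRAINT PK_Readings_200407 PRIMARY KEY (LogTime, DeviceId)
)
GO

-- The view unions the member tables; queries go against Readings, and a
-- WHERE clause on LogTime only touches the relevant member table(s).
CREATE VIEW Readings AS
    SELECT LogTime, DeviceId, Value FROM Readings_200406
    UNION ALL
    SELECT LogTime, DeviceId, Value FROM Readings_200407
GO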

Simon

NewbieNewsGrouper (newbienewsgrouper@.hotmail.com) writes:
> I need to log data collected from instruments in a laboratory at a
> very high rate. I need to log timestamps and values measured from
> various devices at a rate as high as once a second.
> This means my table will grow to tens or hundreds of gigabytes within
> a year. I want to know if SQL server will be able to handle this much
> data.

It will. And once a second is not going to put SQL Server to the test.

With one caveat: you need to have a reasonable table and index structure.
Now, what that table and index structure should look like, I cannot tell,
because I don't know your application. It may also depend on how you will
use that data. Maybe once the data has been collected, you should
move it over to Analysis Services where you can build cubes, dimensions
and whatever.

So the answer is, yes SQL Server can do it, but you have to design
carefully.
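
To give a rough idea only (the column names and types are guesses, since I
don't know what the instruments report), a narrow log table with a clustered
index leading on the timestamp could look like this:

-- A sketch only; the real columns depend on the application.
CREATE TABLE InstrumentLog (
    LogTime  datetime NOT NULL,   -- when the reading was taken
    DeviceId int      NOT NULL,   -- which instrument produced it
    Value    float    NOT NULL    -- the measured value
)

-- Clustering on (LogTime, DeviceId) keeps rows in timestamp order, so new
-- readings are appended at the end of the table and searches on a time
-- range become index seeks rather than full scans.
CREATE CLUSTERED INDEX IX_InstrumentLog_LogTime
    ON InstrumentLog (LogTime, DeviceId)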

> In addition, is it better to separate this data into different
> tables? Does a table become hard to search (search by the timestamp
> field I am logging) once it gets to a certain size?

Again, that depends on your indexing. But depending on your requirements,
partitioning may be a good idea. For instance, if many queries will be
against today's test data, it can be helpful to keep those rows in a
separate table.
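
For example, against the illustrative InstrumentLog table sketched above,
a search over a time window stays a bounded range seek no matter how large
the table has grown:

-- One device's readings for a single day; with the clustered index on
-- (LogTime, DeviceId) this does not scan the whole table.
SELECT LogTime, DeviceId, Value
FROM   InstrumentLog
WHERE  LogTime >= '20040616'
  AND  LogTime <  '20040617'
  AND  DeviceId = 42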

--
Erland Sommarskog, SQL Server MVP, esquel@.sommarskog.se

Books Online for SQL Server SP3 at
http://www.microsoft.com/sql/techin.../2000/books.asp

I once supported a similar application which recorded lab measurement data
to SQL Server. The application architecture was such that data was
initially inserted into staging tables and then imported into a separate
schema for reporting and analysis. It is common to segregate operational
and reporting data so that you can optimize your schema (including indexes)
for the different requirements.
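
A minimal sketch of that staging pattern, with made-up table names (the real
reporting schema would carry whatever indexes the analysis queries need):

-- Staging table: kept narrow and lightly indexed so inserts stay cheap.
CREATE TABLE StagingReadings (
    LogTime  datetime NOT NULL,
    DeviceId int      NOT NULL,
    Value    float    NOT NULL
)

-- Periodic import into the reporting table, which can be heavily indexed
-- without slowing down collection. Copying and deleting by a timestamp
-- cutoff (assuming readings arrive roughly in timestamp order) avoids
-- losing rows that arrive while the import runs.
DECLARE @cutoff datetime
SET @cutoff = GETDATE()

BEGIN TRANSACTION
    INSERT INTO ReportingReadings (LogTime, DeviceId, Value)
    SELECT LogTime, DeviceId, Value
    FROM   StagingReadings
    WHERE  LogTime < @cutoff

    DELETE FROM StagingReadings
    WHERE  LogTime < @cutoff
COMMIT TRANSACTION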

As Simon and Erland said, the key to performance with large tables is
appropriate indexing. SQL Server can certainly handle VLDBs (very large
databases), but be aware
that performance is ultimately constrained by your hardware speed.

--
Hope this helps.

Dan Guzman
SQL Server MVP

"NewbieNewsGrouper" <newbienewsgrouper@.hotmail.com> wrote in message
news:1e253672.0406161118.692433da@.posting.google.c om...
> I need to log data collected from instruments in a laboratory at a
> very high rate. I need to log timestamps and values measured from
> various devices at a rate as high as once a second.
> This means my table will grow to tens or hundreds of gigabytes within
> a year. I want to know if SQL server will be able to handle this much
> data. In addition, is it better to separate this data into different
> tables? Does a table become hard to search (search by the timestamp
> field I am logging) once it gets to a certain size?
> Does anyone have experience or advice aboutt his problem? Is there
> possibly another product out there which is better suited for logging
> time-based data like this than SQL Server?
It is usually standard practice to have your historic data sent to a separate
database. At my company, we have a huge database with millions of
transactions. Right now, the database is well over 350GB (note, however,
this is on a million-dollar 64-processor machine processing millions
of records a day).

Each evening, a DTS job runs which copies specific data (based on
timestamps) out of the production database and into a data warehouse.
This server can then be accessed offline. The benefit of this
alternative database is 1) it is not getting the busy hits that the
production system is getting, and 2) you can have DTS transfer the data
in a way that is beneficial to any reports you may want to run
(submit the data in a certain order, or into tables other than as
structured in your production environment).
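
In T-SQL terms, the copy step of a job like that boils down to something
along these lines (the names are placeholders, and for the sketch both
databases are assumed to be on the same server; in practice the DTS package
also handles moving the rows to the other server):

-- Copy yesterday's rows, identified by timestamp, into the warehouse.
DECLARE @today datetime, @yesterday datetime
SET @today     = CONVERT(char(8), GETDATE(), 112)   -- midnight this morning
SET @yesterday = DATEADD(day, -1, @today)

INSERT INTO WarehouseDB.dbo.Readings (LogTime, DeviceId, Value)
SELECT LogTime, DeviceId, Value
FROM   ProductionDB.dbo.InstrumentLog
WHERE  LogTime >= @yesterday
  AND  LogTime <  @today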

As far as size goes, you can pretty much go as big as you want,
assuming you have the disk space. If you are going to be collecting
data for over a year, I assume this data will be difficult to
reproduce, so you will want to back it up. The data warehouse is an
excellent tool for this.
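
Even a simple scheduled backup of the warehouse database covers that part
(the database name and path below are placeholders):

-- Full backup to disk; in practice this would run as a SQL Server Agent
-- job, possibly with differential or log backups in between.
BACKUP DATABASE WarehouseDB
    TO DISK = 'D:\Backups\WarehouseDB.bak'
    WITH INIT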

Hope this helps.
Mark

I am currently engaged in pulling operating system metrics from BMC
Patrol agents on over 1,000 separate servers into one SQL database. I
have found a dramatic improvement in DB performance when inserting data
into a staging table and then aggregating data into reporting tables.
I currently process over 100 million records daily using a fully
loaded HP DL760 attached to an IBM FAStT SAN.

In article <422d5636.0406271659.25c1a96f@.posting.google.com>,
atyoung75@.yahoo.com says...
> I am currently engaged in pulling operating system metrics from BMC
> Patrol agents on over 1,000 separate servers into one SQL database. I
> have found a dramatic improvement in DB performance when inserting data
> into a staging table and then aggregating data into reporting tables.
> I currently process over 100 million records daily using a fully
> loaded HP DL760 attached to an IBM FAStT SAN.

It's only too much data if you can't get the results you want.

--
spamfree999@.rrohio.com
(Remove 999 to reply to me)
