2EE and .NET are two comprehensive frameworks, especially when you consider their data access layers. JDBC (Java DataBase Connectivity) on one side, and ADO.NET (evolution of ActiveX Data Objects) on the other side are two efficient and flexible APIs that enable us to connect to many different data storage systems. In this article, we'll focus on the usage of those APIs to access relational data.

After a short reminder of each technology historic, we'll review different ways to interact with databases, and we'll ask the following recurring questions :

We won't write about object/relational mapping here (on this topic, you can consult Sami Jaber's articles on ObjectSpaces and competitors), but only about the data access API features.

Once upon a time, JDBC...

JDBC is born in 1996 along with JDK 1.1. Even at that time, this API enabled to connect to any relational database, supposing a connection "driver" existed for this database. Indeed, JDBC is only a programming interface aiming at interacting with a database. In order to really use it, it is mandatory to own a specific implementation of JDBC. That's what we call a Driver.

Unavoidable JDBC drivers types

There are 4 different JDBC drivers types :

Database vendors generally provide a JDBC driver (generally a Type 2 or 4). But sometimes, performance or ease of deployment considerations may lead you to buy a third party driver, developed by firms expert in this domain. Have a look at Sun web site to get an exhaustive list of existing JDBC drivers (cf resources).

JDBC versions and main features

JDBC 1.0 (born in 1996) : the first version of the API enables of course to connect to a database, to run SQL queries and to handle query results. But we have to admit that JDBC 1.0 contains serious gaps, which have been filled in by subsequent versions.

JDBC 2.0 (born in 1998) : the API now splits into two parts named "core" and "optional package". In addition to basic features the previous version already proposed, we find in JDBC 2.0 :

JDBC 3.0 (born in 2002) : this last version can mask some proprietary parts of relational databases engines and can standardize some advanced optimizations. More specifically, JDBC 3.0 offers the possibility to :

 

To conclude, let's say that (unfortunately) all drivers don't support all the features we just spoke about. It will take some time till JDBC drivers developers can implement all JDBC 3.0 optimizations. Except for this, everybody will admit that JDBC is a sharp and flexible tool that enables any Java application to connect to any relational database. And even if we don't mention it here, do know that JDBC can also interact with other storage systems such as spreadsheets or structured files...

Once upon a time, ADO.NET...

Microsoft data access APIs historic

The very first database connection interface in Microsoft environments is of course ODBC (Open DataBase Connectivity), which appeared before JDBC. This generic interface also enables an application (developed in Delphi, Python, VB or VC++ at that time) to connect to any database as soon as an ODBC driver exists for this it.

ODBC is a pretty low level interface. It is not very easy to use, especially in Visual Basic. This need lead Microsoft to introduce RDO (Remote Data Object) so that VB can easily communicate with a relational database. RDO is in fact a layer on top of ODBC, and enables VB to connect to a local or a remote database.

RDO has never been made accessible to VC++ (via Microsoft Foundation Classes), hence developers in this language had to wait for DAO (Data Access Object) to see the integration of data access classes into MFC. But initially, DAO only allowed applications to connect to a Microsoft Access database (via the Jet engine). Knowing DAO success, it was a wise decision to extend the list of databases DAO could access; the technique was simple : DAO can connect either to the Jet engine or to a standard ODBC driver via the ODBCDirect bridge.

Let's see... This makes a good deal of interfaces : ODBC, RDO, DAO... having the same use ! Then Microsoft decided to standardize data sources access with OLE-DB (Object Linking and Embedding - DataBase). This new API enables any language (OLE-DB is based on COM components, hence agnostic to programming languages) to connect to any kind of data source (relational databases, structured files, spreadsheets, message exchange systems). OLE-BD performance generally are far better than those of the previous APIs : to each data source corresponds an "OLE-DB provider", totally re-implemented for this new API and hence that doesn't inherit from its predecessor weaknesses. At worst, if no OLE-DB provider exist for a data source, we can use an OLE-DB / ODBC bridge.

Unfortunately, OLE-DB is a COM API (Component Object Model), hence pretty low level. It has rapidly been simplified by ADO (ActiveX Data Objects), in which we can manipulate interfaces such as Connection, Command, and Recordset. ADO was available for every developer, whatever their programming language. Its philosophy was "connected", that's to say that it was typically made to maintain the connection between applications and databases while dealing with SQL queries results.

With the advent of .NET, the standard data source connection API becomes ADO.NET, initially called ADO+. In this API, the philosophy is more "disconnected" : applications execute their queries on the database, take back a copy of the results, and immediately free the connection before handling data. We'll come back on this style in the next sections.

ADO.NET providers kinds

Like JDBC, ADO.NET is only a programming interface. In order to connect to a data source, we have to get an implementation of this interface. That's what we call a ".NET Data Provider".

Good news ! 2 different .NET Data Providers are included in the .NET framework by default :

If your data source can only be accessed by ODBC (Excel for example), note that there is also an ODBC .NET Data Provider, freely downloadable on Microsoft's Web Site (cf References).

However, we have one little regret regarding this data access API design : an application code using the SQL Server provider isn't the same as the one using OLE DB. More specifically, there are as many class hierarchies as .NET data providers ! A little sample :

// Using the OLE DB .NET Data Provider
OleDbConnection cnx = new OleDbConnection("..."); 
cnx.Open();
OleDbCommand cmd = new OleDbCommand ("...", cnx); 
OleDbDataReader reader = cmd.ExecuteReader();  

// Using the SQL SERVER .NET Data Provider
SqlConnection cnx = new SqlConnection("..."); 
cnx.Open();
SqlCommand cmd = new SqlCommand ("...", cnx); 
SqlDataReader reader = cmd.ExecuteReader();
 

The bets are on that Microsoft will review the design of this API in future versions of the .NET framework and will use a Factory class as well as a set of abstract classes to reduce the coupling between user classes and the type of provider it uses to connect to the data source...

Well. Now that we know where ADO.NET and JDBC come from, we are going to review their respective features and to put in parallel Java and C# code samples showing those features.

Connection and query in the connected mode

Basic examples

Executing a SQL query using JDBC is trivial. Have a look :

import java.sql.*;
public class PremiereConnexion {
   public static void main(String[] argv){
       try{
           // Loading "Hypersonic SQL" database driver 
           Class.forName("org.hsql.jdbcDriver");
           
           // Opening the connection
           Connection cnx = DriverManager.getConnection
           ("jdbc:HypersonicSQL:hsql://localhost") ;
           
           // Executing a SQL query on this connection
           Statement stmt = cnx.createStatement();
           ResultSet rs = stmt.executeQuery("SELECT nom, prenom, age FROM Gurus");
           
           // Handling results
           while (rs.next()){
               String nom = rs.getString("nom"); // or rs.getString(1);
               String prenom = rs.getString("prenom");
               int age = rs.getInt("age");
               // Do something useful with data
           }
       } catch(Exception e){
           // Potential JDBC exceptions management...
       } finally {
           try{
               if (cnx != null){
                   cnx.close();
               }
           }catch(Exception e){
               // Handling the fact that the connection cannot be closed.
           }
       }
   }
}
PremiereConnexion.java



ADO.NET isn't outdone :

public class PremiereConnexion {
  public static void Main(string[] argv){ 
    // Connection configuration
    OleDbConnection cnx = new OleDbConnection
      ("Provider= SQLOLEDB;data source= PAR-GIL;initial catalog=DNG;"
      + "password= héhéhé;persist security info=True;user id=sa;" 
      + "workstation id=PAR-GIL;packet size=4096"); 
    cnx.Open();
    OleDbCommand cmd = new OleDbCommand 
     ("SELECT Nom, Prenom, Age FROM Gurus", cnx); 
    OleDbDataReader reader = cmd.ExecuteReader();  
    while (reader.Read()) {
      String nom = reader.GetString(0); 
      String prenom = reader.GetString(1); 
      int age = reader.GetInt32(2); 
      // Do something useful with data
    }
    cnx.Close();
  }
}
PremiereConnexion.cs

Optimization

JDBC : let's use a PreparedStatement

The preceding Java code isn't optimal : if you notice that you execute the same queries many times in the same program, it is recommended to use a PreparedStatement instead of a Statement. It will accept parameters that may vary from one invocation to the other. He added value of a PreparedStatement is that the SQL query interpretation and the execution plan in the database will only be done once even if we execute this query many times, which noticeably improves data access performance.

// Only the query execution chances.
// Everything else is the same
PreparedStatement stmt = cnx.prepareStatement
   ("SELECT nom, prenom FROM Gurus WHERE age < ?");
stmt.setInt(0, 26); // Equivalent to "WHERE age < 26"
ResultSet rs = stmt.executeQuery();

UtilisationPreparedStatement.java

You can notice that a PreparedStatement simplifies dynamic SQL queries design : we can replace ? by the values we want with clean invocation of dedicated methods such as "setInt(indice)". This is even true for String values : it is not necessary to double simple quotes contained in strings nor to manually handle special characters.

ADO.NET : take advantage of the Command object subtlety

The preceding C# code already used an intermediary object called the Command object (OleDbCommand here) which plays exactly the same role as JDBC' PreparedStatement. Hence, in ADO.NET as well, we can put parameters in the query that we will set afterwards in the program (before launching the precompiled query of course). ADO.NET code simplicity is comparable to JDBC :

// Only the query execution chances.
// Everything else is the same
OleDbCommand cmd = new OleDbCommand
   ("SELECT Nom, Prenom FROM Gurus WHERE age < ?", cnx);
cmd.Parameters.Add(new OleDbParameter("age", 26));
OleDbDataReader reader = cmd.ExecuteReader();  

UtilisationParametresCommand.cs

Stored Procedures

Of course, both JDBC and ADO.NET can execute stored and precompiled procedures on the database. We won't debate here about the never-ending argue between :

It is up to you to adopt a position (a compromise seems to be a wise decision as usual...). We'll only give you an example of how to trigger a stored procedure via our preferred interfaces. But first, here is the Transact-SQL code of the stored procedure we will invoke (compiled and stored in SQL Server 2000) :

CREATE PROCEDURE [dbo].[GetYoungGurus] 
@age  int  
AS
  SELECT Nom, Prenom FROM Gurus WHERE Age < @age
GO

GetYoungGurusProc.sql

Invoking this procedure using JDBC isn't a problem :

CallableStatement stmt = cnx.prepareCall("{call dbo.GetYoungGurus(?)}");
stmt.setInt(1, 26);
ResultSet rs = stmt.executeQuery();

InvocProcStock.java

Same thing in ADO.NET :

OleDbCommand cmd = new OleDbCommand("GetYoungGurus", cnx);
cmd.CommandType = CommandType.StoredProcedure;
cmd.Parameters.Add(new OleDbParameter("age", 26));
OleDbDataReader reader = cmd.ExecuteReader();

InvcProcStock.cs

Batch Updates

Still in the objective to optimize our database access code, note that Java applications can send a set of SQL queries at once to the database. The JDBC API calls this “Batch Update”. The principle is very simple, and well depicted by the following lines :

Statement stmt = cnx.createStatement();
stmt.addBatch("INSERT INTO Gurus VALUES ('Sami', 'Jaber', 28)");
stmt.addBatch("INSERT INTO Gurus VALUES('Thomas', 'Gil', 25)");
int [] updateCounts = stmt.executeBatch();

UtilisationBatchUpdate.java

ADO.NET doesn't directly propose an equivalent of the Batch Update, but you'll see in the following section that it is not penalizing since we'll be able to simulate it using a DataSet.

Disconnected mode

JDBC : Disconnected reading with RowSets

Java RowSets are JavaBeans (therefore serializable) that can be connected or disconnected (since JDBC 2.0). There may be as many RowSets as database vendors or as data access frameworks vendors, but Sun typically provides us with 3 RowSets by default :

In this list, only the CachedRowSet corresponds to a disconnected data access architecture. Here is a simple example of Java code manipulating this class :

Connection cnx = DriverManager.getConnection
  ("jdbc:HypersonicSQL:hsql://localhost");
Statement stmt = cnx.createStatement();
ResultSet rs = stmt.executeQuery("SELECT nom, prenom, age FROM Gurus");
CachedRowSet crset = new CachedRowSet();
crset.populate(rs);
crset.execute();

// Disconnect from the data source
rs.close();
stmt.close();
cnx.close();

// Disconnected data processing
while (crset.next()) {
  System.out.println(crset.getString("nom"));
}

UtilisationCachedRowSet.java

In this short Java code, you will have noticed that a RowSet stands for a record collection in our program memory. It is absolutely not connected to the data source we used. And to fill it, we just have to invoke « populate() ». In this example, we simply read information of the RowSet; we could have done that through a standard ResultSet and it would have been more efficient and less expensive in terms of memory occupation. But if the same data can be useful to our program again in the future, or if a complex data processing must be implemented in Java, our RowSet will act as a cache and will become very interesting because it will avoid many interactions between our application an the database.

ADO.NET : the DataSet object

The ADO.NET framework can also offers a disconnected data access mechanism; its central class is DataSet and globally behave as a RowSet. What is very interesting in the .NET platform is that everything is organized to easily manipulate a DataSet :


As you can see, the .NET framework is centered on this disconnected data processing object, the DataSet, even though it is still possible do handle query results on the fly through a DataReader.

In terms of programming, the code enabling to fill a DataSet is conceptually very close to the Java / RowSet equivalent. Have a look :

SqlCommand cmd = new SqlCommand();
cmd.CommandText = "SELECT Nom, Prenom, Age FROM Gurus";
cmd.Connection = cnx;

// A DataAdapter can execute queries on the database and fill 
// the DataSet with those queries results
SqlDataAdapter da = new SqlDataAdapter();
da.SelectCommand = cmd;

DataSet ds = new DataSet();   
da.Fill(ds);

foreach (DataRow row in ds.Tables[0].Rows){
  Console.WriteLine("Age du gourou : " +  row["age"]);
}

UtilisationDataSet.cs

Everything is done to simplify our lives. But wait, there's more : VisualStudio.NET also helps us to design typed DataSets, ie to shape DataSets for special cases. In our example, using a typed DataSet would enable both to improve (a bit) performance and to make our C# code clearer. Suppose we created a DataSet named “GuruDataSet” using VisualStudio.NET; our code becomes :

SqlCommand cmd = new SqlCommand();
cmd.CommandText = "SELECT Nom, Prenom, Age FROM Gurus";
cmd.Connection = cnx;

SqlDataAdapter da = new SqlDataAdapter();
da.SelectCommand = cmd;

// Typed DataSet : all tables and columns are available 
// through intellisense at coding time !
GuruDataSet gds = new GuruDataSet();
da.Fill(gds);
foreach (GuruDataSet.GurusRow row in gds.Gurus.Rows){
  Console.WriteLine("Age du gourou : " +  row.Age);
}

UtilisationDataSetType.cs

The beginning of the code doesn't change. But as soon as our GuruDataSet is filled, iterating over it becomes much more intuitive and more robust than in the previous example. In fact, the GuruDataSet can only contain (in our design) the "Gurus" table, whose columns are respectively "nom, prénom, age". Of course, designing a typed DataSet is fully graphical in VS.NET :

 

Since the DataSet is typed, developers cannot mistake any more on tables or columns names : they know at compile time if they misspell anything, which helps developing robust applications very rapidly.

How to update data using a RowSet / DataSet

Up to now, we used RowSet / DataSet to read data. It is the simplest and the more frequent usage. But it is also possible to modify data in memory, to add new records, to delete others, and of course to apply these modifications to the original data source. Let's do it in Java ...

// Moving to an absolute row number
// (relative moves are also possible)
crset.absolute(1); 

// In memory modification
crset.updateInt("age", 26); // Updating the guru's age
crset.updateRow(); 

// Inserting a new record
crset.insertRow();
crset.updateString("nom", "Roques");
crset.updateString("prenom", "Pascal");
crset.updateInt("age", 35);

// Deleting an existing record
crset.deleteRow();

// Updating the data source 
crset.acceptChanges();

ModificationRowSet.java

... and in C# (using a typed DataSet here) :

// In-memory modification
gds.Gurus[2].Age = 26; // Updating a guru's age
gds.AcceptChanges();

// Inserting a new record 
GuruDataSet.GurusRow row =  gds.Gurus.NewRow();
row.Nom = "Roques";
row.Prenom = "Pascal";
row.Age = 35;
gds.Gurus.AddGurusRow(row);

// Deleting an existing record
gds.Gurus[2].Delete();

// Updating the data
crset.acceptChanges();

ModificationRowSet.cs

But we have to pay attention to the fact that applying modifications to the data source is much more complex that those example may suggest. Indeed, we have to handle complex situations in which many people work in parallel on different DataSets / RowSets that stand for the same source data ! We won't enter those details here, but it is possible to be informed of data merging problems through exceptions, and to handle those errors programmatically.

Debate : connected or disconnected ?

Both JDBC and ADO.NET offer two data access strategies : one is connected (ResultSet or DataReader), the other disconnected (RowSet or DataSet). What is subtle here is to determine the criteria that will lead us to choose one or the other in each context...

Disconnected architectures

Simple architecture, read only data

Using a DataSet/RowSet in a system where simultaneous clients are few is possible (imagine light clients using ASP.NET or Servlets/JSP on the server side, or rich clients using WindowsForms or JavaSwing) when data access is read-only. In this case, there is absolutely no risk of incoherence or merge problem between different copies of data used in parallel by each client.

Using a DataSet/RowSet in this case is interesting for simplicity reasons : building a rich or a light client that displays the contents of a DataSet is incredibly simplified by VisualStudio.NET (everything can be done graphically !), and we can even handle pagination on a big DataSet.

Of course, the constraint will be memory consumption : if each client runs different queries, we will have to maintain as many DataSet/RowSet as simultaneous clients :

Complex architecture, read-write data

Our applications rarely limit their data access to read-only. In our disconnected mode hypothetic situation, if a client wishes to update the data she sees, she has two options :

Simple datamining, offline data processing

There is a situation maybe less intuitive or less often implemented in which our DataSet/RowSet can prove to be very interesting as well : a client who would wish to see data from different points of view. We can imagine a stock exchange addict (let's say a private trader) who wants to download day to day stock historic for each stock she plays with in order to do a technical analysis of their trends.

Her software could be implemented in WindowsForms and could download at 7 PM a DataSet that would contain all required data and that would calculate on the client side all the technical indicators that will help her to do her analysis (MACD, Stochastic, RSI, etc...). The user could ask her software to zoom on a specific period of the past, to compare a stock trend with an index such as the Dow Jones, or even to try to calculate forecasts in order to be able to decide whether she should buy or sell. The user could also want to display quotations as curves, Japanese candles, and so on... She could do all that stuff without having to stay connected to the network and without undergo any back and forth between client and server. The DataSet enables users to take a snapshot of stock data.

On the server side, it is easy to imagine a WebService that would send the DataSet in XML format back to the client software. This application would be very simple to setup.

 

Data cache anti-pattern

Let's imagine a Web site (say www.dotnetguru.org for example) on which crowds are enormous ( ;-) ). On this site, you can see some statistics such as rush hours, number of hits per week, per month... all those data being aggregated by a dynamic page. In order to improve the site responsiveness, the simplest thing to do wouldn't be to use a data cache but a page (or output) cache. If we consider ASP.NET, the OutputCache would be set to a certain timeout, so that data be close to reality. This way, not only the database wouldn't be stressed too much, but neither would the application : we could short-circuit the whole processing sequence and send back the same HTML page to user during 10 minutes.

Connected architectures

Simple architecture, but huge load

In the situation when simultaneous clients are very numerous, we have to pay attention to our systems memory consumption. Using a data cache (especially on the server side) would become prohibitive. Another argument against DataSet / RowSet is that the more simultaneous users, the more simultaneous data access. Merging caches to the original data source may put the whole system into jeopardy : many conflicts will rapidly imply a performance loss, and many errors sent back to user interfaces. Our software may not remain usable.

In this case, we need more flexibility regarding transactions management and memory consumption. Using a DataReader / ResultSet is recommended, and must be associated with advanced techniques such as stored procedures, connection pools and explicit transactions (that we'll see in the next sections). 

Of course in the real world, a detailed analysis must be lead in order to determine the best compromise.

Transaction management

JDBC and ADO.NET allow us to drive (very precisely) transactions while we run SQL code on our database. This supposes of course that transactions are supported by the target database; and as you know, there are some differences between transactions implementations, especially concerning transactional isolation.

In both APIs, transaction are managed in the scope of a connection. By default, connections set transaction management mode to auto-commit : each SQL query execution is done in a new transaction, which is closed as soon as the query returns. But for performance reasons or to isolate differently the database interactions, you will generally need to disable this automatic mode and to choose a manual commit mode. A short sample ?

In Java:

// Suppose the "cnx" connection is open
// We switch to "explicit transactions" mode
// and start a new transaction (this is implicit)
cnx.setAutoCommit(false);

// Execute a PreparedStatement in the scope of our transaction
PreparedStatement stmt = cnx.prepareStatement
("INSERT INTO Gurus (Nom, Prenom, Age) VALUES (?, ?, ?)");

try {
stmt.setString("Nom", "Jaber");
stmt.setString("Prenom", "Sami");
stmt.setInt("Age", 28);
stmt.executeUpdate();

stmt.setString("Nom", "Gil");
stmt.setString("Prenom", "Thomas");
stmt.setInt("Age", 25);
stmt.executeUpdate();

stmt.setString("Nom", "Roques");
stmt.setString("Prenom", "Pascal");
stmt.setInt("Age", 35);
stmt.executeUpdate();

cnx.Commit();
System.out.println("Les enregistrements ont été écrits en base.");
} 
catch (SQLException ex) {
System.err.println("SQLException: " + ex.getMessage());
if (cnx != null) {
 try {
  System.err.print("Aucun enregistrement n'a été écrit en base.");
  cnx.rollback();
 } 
 catch (SQLException excep) {
  System.err.print("SQLException: ");
  System.err.println(excep.getMessage());
 }
}
} finally {
   try{
       if (cnx != null){
           cnx.close();
       }
   }catch(Exception e){
       // Gestion de l'impossibilité de fermer la connexion
   }
}

TransactionsExplicites.java

And in C# :

// Suppose the "cnx" connection is open
cnx.Open();

// Start an explicit transaction
OleDbTransaction trans = cnx.BeginTransaction();

// Execute a command in the scope of our transaction
OleDbCommand cmd = new OleDbCommand
("INSERT INTO Gurus (Nom, Prenom, Age) VALUES (?, ?, ?)", cnx);
cmd.Transaction = trans;

try
{
cmd.Parameters.Add(new OleDbParameter("Nom", "Jaber"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Sami"));
cmd.Parameters.Add(new OleDbParameter("Age", 28));
cmd.ExecuteNonQuery();  
cmd.Parameters.Clear();
cmd.Parameters.Add(new OleDbParameter("Nom", "Gil"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Thomas"));
cmd.Parameters.Add(new OleDbParameter("Age", 25));
cmd.ExecuteNonQuery();  

cmd.Parameters.Clear();
cmd.Parameters.Add(new OleDbParameter("Nom", "Roques"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Pascal"));
cmd.Parameters.Add(new OleDbParameter("Age", 35));
cmd.ExecuteNonQuery();  

trans.Commit();
Console.WriteLine("Les enregistrements ont été écrits en base.");
}
catch (Exception e)
{
trans.Rollback();
Console.WriteLine(e.ToString());
Console.WriteLine("Aucun enregistrement n'a été écrit en base.");
}
finally
{
cnx.Close();
}

TransactionsExplicites.cs

At the beginning of this article, we said that JDBC 3.0 enabled to set marks in transactions, what we call « SavePoints », that support a partial cancel of the actions encompassed by a transaction (a partial rollback()). Here is how to use this feature (reusing the same code as previously) :

cnx.setAutoCommit(false);
PreparedStatement stmt = cnx.prepareStatement
("INSERT INTO Gurus (Nom, Prenom, Age) VALUES (?, ?, ?)");

// We omit exceptions management here
stmt.setString("Nom", "Jaber");
stmt.setString("Prenom", "Sami");
stmt.setInt("Age", 28);
stmt.executeUpdate();
// Let's put a SavePoint
Savepoint svpt1 = cnx.setSavepoint("SAVEPOINT_1");

stmt.setString("Nom", "Gil");
stmt.setString("Prenom", "Thomas");
stmt.setInt("Age", 25);
stmt.executeUpdate();
// Let's put another SavePoint
Savepoint svpt2 = cnx.setSavepoint("SAVEPOINT_2");

stmt.setString("Nom", "Roques");
stmt.setString("Prenom", "Pascal");
stmt.setInt("Age", 35);
stmt.executeUpdate();

// Partial cancel of modifications we made
cnx.rollback(svpt1);
System.out.println("Seul le premier enregistrement a été écrit en base.");

// Validate the modifications we made before the SavePoint
cnx.commit();
cnx.setAutoCommit(true);

UtilisationSavePoints.java

And without any surprise, we find the same feature in ADO.NET :

OleDbTransaction trans = cnx.BeginTransaction();
OleDbCommand cmd = new OleDbCommand
("INSERT INTO Gurus (Nom, Prenom, Age) VALUES (?, ?, ?)", cnx);
cmd.Transaction = trans;

// We omit exceptions management here
cmd.Parameters.Add(new OleDbParameter("Nom", "Jaber"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Sami"));
cmd.Parameters.Add(new OleDbParameter("Age", 28));
cmd.ExecuteNonQuery();  
// Let's put a SavePoint
trans.Save("SAVEPOINT_1");

cmd.Parameters.Clear();
cmd.Parameters.Add(new OleDbParameter("Nom", "Gil"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Thomas"));
cmd.Parameters.Add(new OleDbParameter("Age", 25));
cmd.ExecuteNonQuery();  
// Let's put another SavePoint
trans.Save("SAVEPOINT_2");

cmd.Parameters.Clear();
cmd.Parameters.Add(new OleDbParameter("Nom", "Roques"));
cmd.Parameters.Add(new OleDbParameter("Prenom", "Pascal"));
cmd.Parameters.Add(new OleDbParameter("Age", 35));
cmd.ExecuteNonQuery();  

// Partial cancel of the modifications we made
trans.Rollback("SAVEPOINT_1");
Console.WriteLine("Seul le premier enregistrement a été écrit en base.");

// Validate the modifications we made before the SavePoint
trans.Commit();

UtilisationSavePoints.cs

Connections pools

When database access tools undergo a huge stress, it becomes prohibitive to systematically open and close connection between application code and databases. Indeed, establishing a connection is a costly process that runs the risk to become a bottleneck for our applications.

As usual in object oriented applications, a fine-grained management of the connections life-cycle will be implemented by a « pool », ie a set of connections to the database that are permanently maintained open, and that are dynamically (on demand) assigned to object that need database connections.

Connections pools are implemented in JDBC as well as in ADO.NET. However, beware of the fact that all JDBC drivers don't implement this part of the specification (or at least not completely). Depending on the requirements of your application, make sure that the driver you use really implements a connections pool, and that this pool is configurable enough for your needs.

Getting a JDBC connections pool thanks to JNDI

The general advice in Java is to externalize database configuration (and hence the connection pool settings) in a directory. Client applications will query the configuration from the directory via JNDI (Java Naming Directory Interface), which avoids incoherencies and gives the opportunity to move the database or to refine the settings without having any impact on client applications.

Getting a reference on a connection pool using JNDI is very simple :

// Directory root
Context ctx = new InitialContext();

// Get a reference on the pool (this is an objet of type DataSource)
DataSource ds = (DataSource)ctx.lookup("jdbc/GuruBD");
Connection cnx = ds.getConnection("tom", "héhéhé");

UtilisationPoolConnexions.java

As you can see, we end up with a 100% standard connection reference, from which we can orchestrate transactions, prepare statements and execute SQL queries. There is absolutely no impact on the rest of the code, except in terms of performance : when you invoke « cnx.close(); » the connection won't be physically closed but simply given back to the pool. It may be reused by another object that asks for a connection via « ds.getConnection(); ».

Of course, depending on the pool settings, we'll support more or less simultaneous connections, and it will be possible to parameterize the increment (the amount of connections the pool must be augmented when running out of connections), and the idle timeout (after which the pool will reduce its size by closing unused connections).

ADO.NET connection pool

The concept being exactly the same, we'll put the stress on technical differences between an ADO.NET pool and a JDBC DataSource.

In fact, using a connection pool is automatic in ADO.NET since OLE-DB and SQL Server providers that are included in the .NET framework support it transparently. How ? Pertty simply : by associating a connection pool to each connection string used by our applications ! Suppose you open two connections using the same connection string, your connections will be managed by a transparent pool :

string cnxString = "Provider=SQLOLEDB;Data Source=localhost;"
+ "Initial Catalog=DNG;Integrated Security=SSPI;";
OleDbConnection cnx = new OleDbConnection (cnxString);
cnx.Open();

OleDbConnection cnx2 = new OleDbConnection (cnxString); 
cnx2.Open();

OleDbConnection cnx3 = new OleDbConnection (cnxString);
cnx3.Open();

// Our pool contains 0 connection, the application uses 3 open connections

cnx3.Close();
cnx.Open();
cnx2.Close();

// Our pool contains 3 available connections, the application 0

cnx = new OleDbConnection (cnxString);

// Our pool contains 2 available connections, the application 1
// No connection was opened by the last line of code

UtilisationPoolConnexions.cs

By default, pools have an initial size of 0 connections and a maximum size of 100. You can modify these settings via the connection string itself :

cnx.ConnectionString = "Provider=SQLOLEDB;Data Source=localhost;"
+ "Initial Catalog=DNG;Integrated Security=SSPI;"
+ "Max Pool Size=50;Min Pool Size=10";

ParametresPoolConnexions.cs

Conclusion

JDBC and ADO.NET have comparable features. Both APIs evolved in parallel, they progressively introduced the same concepts (ResultSets or in-memory scrollable and modifiable RecordSets, RowSets or DataSets for disconnected architectures, the connection pool pattern, and the ability to manage transaction in a fine grained way...).

JDBC and ADO.NET don't impose any technical architecture. They both can be used :

We could even imagine that serializing an ADO.NET DataSet could be deserialized into a JDBC RowSet in order to reduce the gap between J2EE and .NET even more. Any candidate ?

 

Auteur : Thomas GIL

Translator : Thomas GIL

Copyright : DotNetGuru ©

 

Ressources

JDBC drivers search engine : http://industry.java.sun.com/products/jdbc/drivers

ODBC .NET Data Provider : http://msdn.microsoft.com/downloads