J2EE Search and Pagination Best Practices
Home Infrastructure Development Environment Architecture

This page captures the best practices I've come across for handling search and pagination style operations in web applications.


First my definitions of the terms

  • Search - the ability to do a simple search (based on a simple keyword) or an advanced search (search based on multiple fields).
  • Pagination - the ability to page through a large result set.

There are several ways of implementing search and pagination mechanisms in a web application. The following is based on a db-centric approach (you may also need to integrate a search engine such as lucene if you want to do a lot of full-text searches).

Let's say you want to provide advanced search capability for product entities based on product code, name or category.

Advanced Search Example

Product Code
Product Name
Category
 

One approach is to create the query dynamically based on which fields are entered by the end user. This strategy has its own set of problems including performance (the SQL needs to be parsed each time by the DB) , security (it yields itself to text manipulation attacks) and formatting (formatting dates etc. in db-specific ways can be a big hassle). The recommendation therefore is to use prepared statement queries as follows:

select product.id, product.code, product.name, prdoduct.category,.....
from product
where (? is null or product.code= ?)
and (? is null or product.name= ?)
and (? is null or product.category= ?)
order by product.code

where the end-user entered product code is passed to the first two placeholders, end-user entered product name is passed to the next two place holders etc.

Pagination

Read this article for a good introduction to the basics on pagination and options. I've seen many solutions to pagination, everything from no pagination (meaning a page can get populated with thousands of results) to caching the entire result set in the session to allow in-memory pagination! Other commonly seen (bad) implementations include the use of stateful session beans or http sessions that cache references to open result sets (with open connections yikes!). Such methods would work for a sample application but not for a scalable real world application.

Following is a very generic solution to pagination (even for queries with complex ordering - this solution is also explained in the above article). Limiting the results page to the page size is achieved by using the JDBC statement's setMaxRows method. (Even Hibernate provides similar functionality)

The idea is to make sure that your query provides an unambiguous sort order and has sufficient columns to uniquely identify your result rows. You can obtain the next and previous pages by addiing the relevant filters to the where clause.

Original select statement:

Let's say the original select statement you want to page through looks as follows:

select col2, col3, col4...
where col4 = ?
order by col2 asc, col3 desc

Let's say that col2 and col3 does not constitute a unique row identifier(worst case). In such a case, you will have to add a field to the above query that provides a unique row id and also include that field (let's say col1) in the sort order. So the query becomes:

select col1, col2, col3, col4...
where col4=?

order by col2 asc, col3 desc, col1 asc

 

Sample Output
col1 col2 col3
1 A 5
2 A 4
3 A 4
1 B 9
2 B 8
3 B 4

The row id for this query is (col2, col3, col1) defined by the sort columns. In the queries below, i'm using p_col1, p_col2 and p_co3 to refer to the column values of the last rows of the previously referred page. You can use the results above with a pagesize=2 to walk through the queries below:

Query to fetch the next page:
The query that you will send from any page, to get to the next page is as follows:

select col1, col2, col3, col4...
where col4=?
and ( ( col2 > :p_col2)
or (col2= :p_col2 and col3< :p_col3)
or (col2=:p_col2 and col3=:p_col3 and col1>:p_col1)
order by col2 asc, col3 desc, col1 asc

Note how the greater than or smaller than signs correspond to the sort order for that field.

Query to fetch the previous page:

select col1, col2, col3, col4...
where col4=?
and ( ( col2 < :p_col2)
or (col2= :p_col2 and col3> :p_col3)
or (col2=:p_col2 and col3=:p_col3 and col1<:p_col1)
order by col2 desc, col3 asc, col1 desc

The rows will be returned in the reverse order when fetching the previous page and will have to be re-ordered within the java code. It's also good practice to introduce DB indexes to the sort columns. Ofcourse, if your sort ordering and row id's are simple, the modified queries will also be much simpler.

 

 


©2004 Chandika N. Mendis  
Hosted by www.Geocities.ws

1