Working of HashSet in Java
Java Set Interface
The Java Set interface represents a collection of elements that contains no duplicates. Unlike an array or a List, it does not give its elements a positional index. When we try to add an element that is already present in the Set, the Set is left unchanged and the element is not stored again. It is used to model the mathematical set abstraction.
Java HashSet class
The Java HashSet class represents a set of unique elements (objects). It does not guarantee the order of its elements. It stores the elements in a hash table, using a technique called hashing. It extends the AbstractSet class and implements the Set interface. Internally, HashSet uses a HashMap.
Suppose we want to create a HashSet to store a group of Strings. We create the object as: HashSet<String> hs = new HashSet<String>();
Here, <String> is the generic type parameter. It represents the type of the elements stored in the HashSet.
HashSet implements the Set interface and guarantees uniqueness. It achieves this by storing each element as a key in an internal HashMap, always with the same dummy value. HashSet has no get() method for retrieving an individual object; the only way to get objects out of a HashSet is to iterate over it, for example with an Iterator. When we create a HashSet with the no-argument constructor, it internally creates a HashMap with the default initial capacity of 16.
HashSet also provides a constructor HashSet(int capacity), where capacity is the initial capacity of the backing hash table, not a limit on the number of elements. The capacity increases automatically as more elements are stored.
Another constructor is HashSet(int capacity, float loadfactor). Here, loadfactor determines the point at which the capacity of the HashSet is increased internally. For example, with a capacity of 101 and a load factor of 0.5, the product is 101*0.5=50.5, so once the number of stored elements passes that threshold, the capacity is increased internally to make room for more elements. (OpenJDK's HashMap rounds the requested capacity up to the next power of two, here 128, so the actual threshold there is 128*0.5=64.) The default initial capacity of HashSet is 16, and the default load factor is 0.75.
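The three constructors described above can be sketched as follows. The class name HashSetConstructors and the helper fill() are made up for illustration; only the HashSet constructors themselves come from the JDK.

```java
import java.util.HashSet;
import java.util.Set;

class HashSetConstructors {
    // Hypothetical helper: builds a set with an explicit initial capacity
    // and load factor, then stores `count` distinct integers in it.
    static Set<Integer> fill(int initialCapacity, float loadFactor, int count) {
        Set<Integer> set = new HashSet<>(initialCapacity, loadFactor);
        for (int i = 0; i < count; i++) {
            set.add(i); // the backing table grows by itself when the threshold is passed
        }
        return set;
    }

    public static void main(String[] args) {
        Set<String> defaults = new HashSet<>();    // initial capacity 16, load factor 0.75
        Set<String> sized = new HashSet<>(101);     // initial capacity 101, load factor 0.75
        Set<Integer> tuned = fill(101, 0.5f, 200);  // more elements than the initial capacity
        System.out.println(defaults.size() + " " + sized.size() + " " + tuned.size()); // prints "0 0 200"
    }
}
```

The capacity and load factor only affect when the internal table is resized; they never limit how many elements the set can hold.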
In the following example, we use the add() method to add elements to a HashSet.
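A minimal sketch of such a program, assuming a class named AddDemo, a helper buildSet(), and the variable name hs (all chosen here for illustration):

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class AddDemo {
    // Creates a HashSet and adds three distinct country names to it.
    static Set<String> buildSet() {
        Set<String> hs = new HashSet<>();
        hs.add("India");
        hs.add("America");
        hs.add("Russia");
        return hs;
    }

    public static void main(String[] args) {
        Set<String> hs = buildSet();
        System.out.println("Set is " + hs);

        // Walk over the elements one at a time with an Iterator.
        System.out.print("Elements using iterator:");
        Iterator<String> it = hs.iterator();
        while (it.hasNext()) {
            System.out.print(" " + it.next());
        }
        System.out.println();
    }
}
```

The order in which the elements are printed depends on their hash codes, not on the order of insertion, and HashSet does not guarantee it.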
Output:
Set is [America, India, Russia]
Elements using iterator: America India Russia
In the following example, we try to add some duplicate values.
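A minimal sketch of such a program, again with an assumed class name (DuplicateDemo) and helper (buildSet()); which values are repeated is also an assumption made for illustration:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class DuplicateDemo {
    // Adds four distinct names and then repeats two of them.
    static Set<String> buildSet() {
        Set<String> hs = new HashSet<>();
        hs.add("India");
        hs.add("America");
        hs.add("Russia");
        hs.add("China");
        hs.add("India");   // duplicate: the set is left unchanged
        hs.add("Russia");  // duplicate: the set is left unchanged
        return hs;
    }

    public static void main(String[] args) {
        Set<String> hs = buildSet();
        System.out.println("Set is " + hs);

        System.out.print("Elements using iterator:");
        Iterator<String> it = hs.iterator();
        while (it.hasNext()) {
            System.out.print(" " + it.next());
        }
        System.out.println();
    }
}
```

Six calls to add() leave only four elements in the set.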
Output:
Set is [China, America, India, Russia]
Elements using iterator: China America India Russia
In the above example, we added some duplicate values. We can observe that the duplicates are not stored in the HashSet. When we pass a duplicate element to the add() method of the Set object, the method returns false.
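The return value can be observed directly. The sketch below assumes a class named AddReturnDemo and a small helper, secondAddResult(), introduced only for this illustration:

```java
import java.util.HashSet;
import java.util.Set;

class AddReturnDemo {
    // Hypothetical helper: adds the same element twice and
    // returns what the second call to add() reported.
    static boolean secondAddResult(Set<String> set, String element) {
        set.add(element);
        return set.add(element);
    }

    public static void main(String[] args) {
        Set<String> hs = new HashSet<>();
        boolean first = hs.add("India");   // element is new
        boolean second = hs.add("India");  // element is already present
        System.out.println(first + " " + second + " " + hs.size()); // prints "true false 1"
    }
}
```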
This raises the question of how add() knows to return false. When we open the implementation of the add() method in the HashSet source (shipped in rt.jar in Java 8 and earlier), we find that its body is a single statement: return map.put(e, PRESENT)==null;
In the above code, a call to add(object) is delegated internally to put(key, value), where the key is the object we passed and the value is another object called PRESENT. It is a constant in java.util.HashSet.
Set achieves uniqueness internally through HashMap. When we create a HashSet object, it creates a HashMap object. We know that each key in a HashMap is unique, so the argument we pass to add(E e) is used as the key. A key needs some value to be associated with it, so HashSet associates it with a dummy value (new Object()), which is referred to by the Object reference PRESENT.
When we add an element to a HashSet, for example hs.add("India"), Java internally puts that element ("India" here) as a key into the HashMap that was created when the HashSet object was constructed. The dummy Object is passed as the value for that key.
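The delegation described above can be modelled in a few lines. SimpleHashSet is a hypothetical, simplified class written for this illustration; it is not the JDK source, although the field names map and PRESENT follow the text.

```java
import java.util.HashMap;
import java.util.Iterator;

// Simplified model of a set backed by a HashMap; not the real java.util.HashSet.
class SimpleHashSet<E> implements Iterable<E> {
    // One shared dummy object that is stored as the value for every key.
    private static final Object PRESENT = new Object();

    // The elements of the set are the keys of this map.
    private final HashMap<E, Object> map = new HashMap<>();

    // put() returns null only when the key had no earlier mapping,
    // so a null result means the element was newly added.
    public boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    public boolean contains(Object o) {
        return map.containsKey(o);
    }

    public int size() {
        return map.size();
    }

    // Iterating over the set means iterating over the keys of the map.
    @Override
    public Iterator<E> iterator() {
        return map.keySet().iterator();
    }

    public static void main(String[] args) {
        SimpleHashSet<String> hs = new SimpleHashSet<>();
        System.out.println(hs.add("India")); // prints true
        System.out.println(hs.add("India")); // prints false
        System.out.println(hs.size());       // prints 1
    }
}
```

Because every element shares the same PRESENT object, the value side of the map costs no extra memory per element beyond the reference itself.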
put method of HashMap
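The important points about the put(key, value) method are that it returns null if the key was not already in the map, and that it returns the previous value if the key was already present.
When we invoke the add() method of HashSet, Java internally compares the return value of map.put(key, value) with null. If the return value is null, the element was new and add() returns true; otherwise the element was already in the set and add() returns false.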
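The behaviour of put() that add() relies on can be checked on a plain HashMap. The class name PutReturnDemo, the constant DUMMY, and the helper isNewKey() are assumptions made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

class PutReturnDemo {
    // Stand-in for the dummy value that a HashSet stores against every key.
    static final Object DUMMY = new Object();

    // Hypothetical helper: mirrors the null check performed by HashSet.add().
    static boolean isNewKey(Map<String, Object> map, String key) {
        return map.put(key, DUMMY) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        Object first = map.put("India", DUMMY);   // no earlier mapping for "India"
        Object second = map.put("India", DUMMY);  // "India" is already a key
        System.out.println(first == null);    // prints true
        System.out.println(second == DUMMY);  // prints true
    }
}
```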
Retrieving Object from the HashSet
We use the iterator() method to retrieve objects from the HashSet. It is a method of the java.util.HashSet class. It returns an iterator over the keys of the backing map, obtained by calling map.keySet().iterator().
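Since there is no get() method, looking up a particular object means iterating until it is found. A minimal sketch, assuming a class named RetrieveDemo and a helper findByPrefix() invented for this example:

```java
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class RetrieveDemo {
    // Returns the first element (in iteration order) that starts
    // with the given prefix, or null if no element matches.
    static String findByPrefix(Set<String> set, String prefix) {
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            String element = it.next();
            if (element.startsWith(prefix)) {
                return element;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Set<String> hs = new HashSet<>();
        hs.add("India");
        hs.add("America");
        hs.add("Russia");
        System.out.println(findByPrefix(hs, "Ru")); // prints Russia
        System.out.println(findByPrefix(hs, "Ja")); // prints null
    }
}
```

When only membership matters, contains() is the better choice, because it uses the hash table directly instead of visiting every element.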