Automated negotiation based on learning models has been widely applied across negotiation domains, in particular to resource allocation in decentralised open market environments with multiple vendors and multiple buyers. In such environments, the supply of and demand for resources change dynamically, and buyers arrive in the market dynamically. In addition, each buyer has its own set of constraints, such as budget and time constraints. In this context, efficient negotiation policies should be capable of maintaining an equilibrium between the utilities of the vendors and the buyers. In this research, we aim to design a mechanism for an optimal auction paradigm that accounts for the interdependent, undisclosed preferences of both buyers and vendors. Learning-based negotiation models are well suited to such open market environments, in which self-interested autonomous vendors and buyers cooperate or compete to maximise their utilities based on their undisclosed preferences. Toward this end, we present our current proposal, a two-stage learning-based resource allocation mechanism in which the utilities of vendors and buyers are optimised at each stage. We plan to compare the proposed mechanism with two state-of-the-art bidding-based resource allocation mechanisms, one based on a fixed bidding policy (Samimi, Teimouri, and Mukhtar 2016) and the other on a demand-based bidding policy (Kong, Zhang, and Ye 2015). The comparison will be based on the overall performance of the open market environment as well as on the individual performance of vendors and buyers.