Jumbune

Accelerated Hadoop based Analytics

Release 1.1.0 (12 May 2017)

Features available
  • Resource Utilization Metering - This feature helps organizations running multi-tenant clusters charge their customers based on the resources they consume. Consumption is derived from the vCores and memory used across the different execution engines (a chargeback sketch follows this list).
  • Relative Pool Utilization by Users - Enables organizations to analyze how submitting users consume resources within queues, giving visibility into which customers make the heaviest use of a queue and its resources.
  • Queue Pool Utilization Summary - Helps organizations analyze monthly queue utilization by their customers. This supports re-defining the maturity level of queues, which in turn leads to more appropriate resource utilization in the data lake.
  • Data Cleansing - The Data Cleansing module classifies data based on user-defined criteria. It detects records in a data set that violate the defined constraints and persists both the clean and the violating data files on HDFS, giving assurance that the cleansed data, once ingested, will let the application run correctly on the first attempt.
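The following is a minimal sketch of the kind of chargeback calculation this metering enables; it is not Jumbune's implementation. It assumes the YARN ResourceManager REST API, which reports memorySeconds and vcoreSeconds per application, and the ResourceManager URL and billing rates are hypothetical placeholders.

    import requests
    from collections import defaultdict

    # Hypothetical ResourceManager address and illustrative billing rates.
    RM_APPS_URL = "http://resourcemanager:8088/ws/v1/cluster/apps"
    RATE_PER_VCORE_SECOND = 0.000002
    RATE_PER_MB_SECOND = 0.0000001

    def chargeback_per_user():
        """Aggregate vcore-seconds and MB-seconds per user and derive a charge."""
        payload = requests.get(RM_APPS_URL, params={"states": "FINISHED"}).json()
        apps = (payload.get("apps") or {}).get("app", [])
        usage = defaultdict(lambda: {"vcoreSeconds": 0, "memorySeconds": 0})
        for app in apps:
            usage[app["user"]]["vcoreSeconds"] += app["vcoreSeconds"]
            usage[app["user"]]["memorySeconds"] += app["memorySeconds"]
        return {
            user: round(u["vcoreSeconds"] * RATE_PER_VCORE_SECOND
                        + u["memorySeconds"] * RATE_PER_MB_SECOND, 2)
            for user, u in usage.items()
        }

    if __name__ == "__main__":
        for user, charge in sorted(chargeback_per_user().items()):
            print(f"{user}: ${charge}")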
Enhancements
  • Spark Recommendations - Optimal configurations such as executor memory, executor cores, and executor instances are recommended for Spark jobs (an example follows this list).
  • Hybrid Cluster Support - Jumbune now supports hybrid clusters, i.e. clusters comprising some on-premise nodes and some compute nodes in the cloud. This helps organizations analyze and optimize their jobs on a newly built hybrid cluster.
  • Offline Statistics Capture - Enables Jumbune to capture statistics of cluster resources and queues while offline, so organizations can review all offline statistics in Analyze Cluster at any point in time.
  • Export - Introduced in the Analyze Cluster module to let users persist all queue and daemon statistics. The exported data can be used as an analysis aid for running and scheduling new jobs more effectively in the future.
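To illustrate how recommended settings of this kind might be applied to a Spark job, here is a PySpark sketch; this is not Jumbune's API, and the values below are placeholders rather than actual recommendations.

    from pyspark.sql import SparkSession

    # Placeholder values standing in for recommended settings; the keys are
    # standard Spark configuration properties.
    recommended = {
        "spark.executor.memory": "4g",
        "spark.executor.cores": "4",
        "spark.executor.instances": "10",
    }

    # Executor settings only take effect on a cluster manager such as YARN;
    # local mode is used here just to demonstrate applying the configuration.
    builder = SparkSession.builder.appName("recommended-config-demo").master("local[*]")
    for key, value in recommended.items():
        builder = builder.config(key, value)

    spark = builder.getOrCreate()
    print({k: spark.sparkContext.getConf().get(k) for k in recommended})
    spark.stop()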
Bug Fixes
  • Analyze Cluster
      ◦ Configuration changes for the "High Resource Usage Applications" component are now reflected immediately on the UI.
      ◦ Capacity Scheduler statistics were shown instead of Fair Scheduler statistics even though the Fair Scheduler was configured.
      ◦ Recommendations were not generated for all nodes, even though the recommended configurations had been applied to only one node of the cluster.
      ◦ "Queue Utilization" statistics were not captured for a Fair Scheduler configuration when "Background Process" was turned ON.
  • Optimize Job
      ◦ When run with the manual option, the graph showed only one iteration.
      ◦ The recommendation text box showed example times in both seconds and minutes; it now shows minutes only.
      ◦ Legends are now shown alongside the graph on the result page.
      ◦ Optimize Job failed when run with the "Defined in FS" option.
      ◦ Optimize Job failed when the Capacity Scheduler was configured with child queues.
      ◦ Jobs were not being tripped within the given maximum time.
      ◦ Jobs were logged under the Agent user rather than the user who submitted the job in real time.
  • Data Cleansing
      ◦ On the result page, 'ROOT' appeared as a single straight line.
      ◦ Data Quality Timeline: when a job was submitted to execute every hour, no entry was logged in the cron file.
      ◦ Data Validation: module names shown on the Resource Manager UI for executed jobs are now specific to the running module, i.e. Data Cleansing for the "Data Cleansing" module and Data Profiling for the "Data Profiling" module.
      ◦ Data Validation: the "parameter" text box is now optional, and its tooltip has been updated to help users provide correct values.
  • Manage Cluster - The "Alert" tab now shows enable/disable in place of "true/false".
  • Dashboard - Corrected the "License To" and "License From" messages in the "License" section.
  • Deployment - The InfluxDB port was not being configured with the port provided at deployment time.
About Us

Jumbune, a product from Impetus Technologies, helps optimize and analyze Big Data applications running on enterprise clusters. It is built on open source, is highly scalable, and provides deep insights into the performance of Hadoop applications and clusters.